Patent abstract:
Compositions and methods for using programmable DNA-binding proteins to increase the efficiency and/or specificity of targeted genome modification or to facilitate the detection of specific genomic loci in eukaryotic cells.
Publication number: BR112018074531B1
Application number: R112018074531-6
Filing date: 2017-02-20
Publication date: 2021-01-19
Inventor: Fuqiang Chen
Applicant: Sigma-Aldrich Co. LLC
Main IPC class:
Patent description:

FIELD
[001] The present invention relates to compositions and methods for increasing the efficiency and/or specificity of targeted genome modification.
BACKGROUND
[002] Programmable endonucleases have increasingly become an important tool for targeted genome engineering or modification in eukaryotes. Recently, RNA-guided clustered regularly interspersed short palindromic repeats (CRISPR)/CRISPR-associated (Cas) (CRISPR/Cas) systems have emerged as a new generation of genome modification tools. These new programmable endonucleases have greatly improved genome editing capability compared with previous generations of nucleases, such as zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs).
[003] However, not all genomic targets are accessible to efficient modification by these programmable endonucleases. In fact, some CRISPR/Cas endonucleases appear to have little or no activity in human cells. Among other things, chromatin structure can present a barrier to these programmable endonucleases and prevent them from binding to the target sequence. Thus, there is a need to improve the accessibility of these programmable endonucleases to target sequences and/or to improve the efficiency of targeted genome modification. In addition, there is a need to increase the specificity of targeted genome modification by reducing off-target effects.
SUMMARY
[004] Among the various aspects of the present invention is a composition comprising (a) a programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein. In general, the programmable DNA modification protein has nuclease activity (that is, it cleaves both strands of a double-stranded sequence) or non-nuclease activity (for example, epigenetic modification activity or transcriptional regulation activity), and the at least one programmable DNA-binding protein lacks nuclease activity.
[005] In embodiments in which the programmable DNA modification protein has nuclease activity, for example, the programmable DNA modification protein can be selected from an RNA-guided clustered regularly interspersed short palindromic repeats (CRISPR)/CRISPR-associated (Cas) (CRISPR/Cas) nuclease system, a CRISPR/Cas dual nickase system, a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), a meganuclease, a fusion protein comprising a programmable DNA-binding domain linked to a nuclease domain (i.e., one that generates a double-stranded DNA break), and combinations thereof.
[006] In embodiments in which the programmable DNA modification protein has non-nuclease activity, for example, the programmable DNA modification protein can be a fusion protein comprising a programmable DNA-binding domain linked to a non-nuclease modification domain. In certain embodiments, the programmable DNA-binding domain of the fusion protein can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, or a transcription activator-like effector, and the non-nuclease modification domain of the fusion protein can have acetyltransferase activity, deacetylase activity, methyltransferase activity, demethylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, demyristoylation activity, citrullination activity, helicase activity, amination activity, deamination activity, alkylation activity, dealkylation activity, oxidation activity, transcriptional activation activity, or transcriptional repressor activity. In specific embodiments, the non-nuclease modification domain of the fusion protein has cytosine deaminase activity, histone acetyltransferase activity, transcriptional activation activity, or transcriptional repressor activity.
[007] According to certain embodiments of the compositions described herein, the at least one programmable DNA-binding protein can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase.
[008] In general, the nucleic acid encoding the programmable DNA modification protein and/or the at least one programmable DNA-binding protein is mRNA or DNA. In some embodiments, the nucleic acid encoding the programmable DNA modification protein and/or the at least one programmable DNA-binding protein is part of a vector such as, for example, a plasmid vector, a lentiviral vector, an adeno-associated viral vector, or an adenoviral vector.
[009] In specific embodiments, the programmable DNA modification protein comprises a CRISPR/Cas nuclease system, a CRISPR/Cas dual nickase system, or a catalytically inactive CRISPR/Cas system linked to a non-nuclease domain, and the at least one programmable DNA-binding protein comprises a catalytically inactive CRISPR/Cas system, wherein each CRISPR/Cas system comprises a CRISPR/Cas protein and a guide RNA. In various embodiments, each CRISPR/Cas nuclease system can be a type I CRISPR/Cas system, a type II CRISPR/Cas system, a type III CRISPR/Cas system, or a type V CRISPR/Cas system. In some embodiments, each guide RNA can be at least partially chemically synthesized. In other embodiments, each guide RNA can be enzymatically synthesized. In yet other embodiments, the nucleic acid encoding each CRISPR/Cas protein can be mRNA and the nucleic acid encoding each guide RNA can be DNA. In certain aspects, the nucleic acid encoding the CRISPR/Cas protein and/or the nucleic acid encoding the guide RNA can be part of a vector, for example, a plasmid vector, a lentiviral vector, an adeno-associated viral vector, or an adenoviral vector.
[0010] Another aspect of the present invention encompasses kits comprising any one or more of the compositions detailed above.
[0011] Yet another aspect of the present disclosure provides methods for increasing the efficiency and/or specificity of targeted genome modification in a eukaryotic cell. The methods involve introducing into a eukaryotic cell (a) a programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein. The programmable DNA modification protein is targeted to a target chromosomal sequence, and each of the at least one programmable DNA-binding protein is targeted to a site proximal to the target chromosomal sequence. Binding of the at least one programmable DNA-binding protein to the site proximal to the target chromosomal sequence increases the accessibility of the programmable DNA modification protein to the target chromosomal sequence, thereby increasing the efficiency and/or specificity of targeted genome modification. The proximal site bound by each of the at least one programmable DNA-binding protein is located, for example, within about 250 base pairs on either side of the target chromosomal sequence. In some embodiments, the proximal binding site is located less than about 200 bp or less than about 100 bp on either side of the target chromosomal sequence.
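The proximity criterion in this paragraph can be expressed as a simple coordinate check. The following Python sketch is illustrative only and is not part of the disclosed methods: the `Site` type, the half-open coordinate convention, the function names, and the treatment of overlapping sites as distance zero are all assumptions made for the example. The default window of 250 bp is taken from the text, where it is stated as "about 250 base pairs".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical half-open interval [start, end) on one chromosome, in bp."""
    start: int
    end: int


def gap_bp(target: Site, binding: Site) -> int:
    """Base pairs separating the two sites; 0 if they touch or overlap (assumption)."""
    if binding.end <= target.start:      # binding site lies upstream of the target
        return target.start - binding.end
    if binding.start >= target.end:      # binding site lies downstream of the target
        return binding.start - target.end
    return 0


def is_proximal(target: Site, binding: Site, window_bp: int = 250) -> bool:
    """True if the binding site lies within window_bp on either side of the target."""
    return gap_bp(target, binding) <= window_bp


target = Site(1000, 1023)
print(is_proximal(target, Site(900, 920)))                    # 80 bp upstream
print(is_proximal(target, Site(1300, 1320)))                  # 277 bp downstream
print(is_proximal(target, Site(1150, 1170), window_bp=100))   # stricter window
```

Passing `window_bp=200` or `window_bp=100` corresponds to the narrower ranges mentioned for some embodiments.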
[0012] The programmable DNA modification protein used in the method can be a CRISPR/Cas nuclease system, a CRISPR/Cas dual nickase system, a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), a meganuclease, a fusion protein comprising a programmable DNA-binding domain linked to a nuclease domain, or a fusion protein comprising a programmable DNA-binding domain linked to a non-nuclease domain. The programmable DNA-binding domain of the fusion protein can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, or a transcription activator-like effector, and the non-nuclease modification domain of the fusion protein can have acetyltransferase activity, deacetylase activity, methyltransferase activity, demethylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, demyristoylation activity, citrullination activity, helicase activity, amination activity, deamination activity, alkylation activity, dealkylation activity, oxidation activity, transcriptional activation activity, or transcriptional repressor activity. In specific embodiments, the non-nuclease modification domain of the fusion protein has cytosine deaminase activity, histone acetyltransferase activity, transcriptional activation activity, or transcriptional repressor activity.
[0013] The at least one programmable DNA-binding protein used in the method binds DNA but lacks nuclease activity (i.e., double-stranded cleavage activity). In certain embodiments, the at least one programmable DNA-binding protein can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase.
[0014] In specific embodiments, the programmable DNA modification protein comprises a CRISPR/Cas nuclease system, a CRISPR/Cas dual nickase system, or a catalytically inactive CRISPR/Cas system linked to a non-nuclease domain, and the at least one programmable DNA-binding protein comprises a catalytically inactive CRISPR/Cas system, wherein each CRISPR/Cas system comprises a CRISPR/Cas protein and a guide RNA.
[0015] In various embodiments, at least two, at least three, or more than three programmable DNA-binding proteins are introduced into the eukaryotic cell. In specific embodiments, the eukaryotic cell is a mammalian cell or a human cell.
[0016] Another aspect of the present invention encompasses methods for detecting a chromosomal sequence or genomic locus in a eukaryotic cell. The methods involve introducing into the eukaryotic cell (a) a programmable DNA-binding protein comprising at least one detectable marker domain, or nucleic acid encoding the programmable DNA-binding protein comprising at least one detectable marker domain, and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein, wherein the programmable DNA-binding protein comprising at least one detectable marker domain is targeted to a target chromosomal sequence and each of the at least one programmable DNA-binding protein is targeted to a site proximal to the target chromosomal sequence, and wherein binding of the at least one programmable DNA-binding protein to the site proximal to the target chromosomal sequence increases the accessibility of the programmable DNA-binding protein comprising at least one detectable marker domain to the target chromosomal sequence. The methods can further involve detecting the programmable DNA-binding protein comprising at least one detectable marker domain bound to the target chromosomal sequence. The detection step can be performed in live or fixed cells and can involve, for example, dynamic live cell imaging, fluorescence microscopy, confocal microscopy, immunofluorescence, immunodetection, RNA-protein binding, or protein-protein binding.
[0017] The programmable DNA-binding protein comprising at least one detectable marker domain that is used in the detection method comprises a programmable DNA-binding domain, which can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, or a transcription activator-like effector. The at least one detectable marker domain of the programmable DNA-binding protein comprising at least one detectable marker domain can be, for example, a fluorescent protein, a fluorescent tag, an epitope tag, or a naturally occurring epitope within the programmable DNA-binding protein. In some embodiments, the programmable DNA-binding protein comprising at least one detectable marker domain can further comprise a non-nuclease modification domain. The at least one programmable DNA-binding protein binds DNA but lacks nuclease activity (i.e., double-stranded cleavage activity). In some embodiments, the at least one programmable DNA-binding protein can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase. In specific embodiments, the programmable DNA-binding protein comprising at least one detectable marker domain can be a catalytically inactive CRISPR/Cas system linked to at least one detectable marker domain, and the at least one programmable DNA-binding protein can be a catalytically inactive CRISPR/Cas system.
[0018] Other aspects and features of the invention are detailed below.
BRIEF DESCRIPTION OF THE FIGURES
[0019] FIG. 1 provides a diagram of one embodiment of the methods described herein. Proximal binding of programmable DNA-binding protein(s) increases the accessibility of the target site to a programmable nuclease, thereby increasing the efficiency of cleavage at the target site.
[0020] FIG. 2 illustrates that binding of catalytically inactive SpCas9 (SpdCas9) to proximal site(s) increases the efficiency of cleavage by FnCas9. The sequences presented at the top show the relative locations of the FnCas9 target site and the SpdCas9 binding sites at the POR locus. The results of a Cel-I nuclease assay are shown at the bottom.
[0021] FIG. 3A illustrates the design of an experiment to determine whether binding of catalytically inactive SpCas9 (SpdCas9) increases the accessibility and binding of epitope-tagged (i.e., FLAG-tagged) catalytically inactive CjCas9 (CjdCas9) to a previously inaccessible site at the POR locus.
[0022] FIG. 3B provides a diagram of the chromatin immunoprecipitation binding assay used to detect epitope-labeled CjdCas9 binding to target sites at the POR and AAVS1 loci.
[0023] FIG. 3C illustrates that binding of SpdCas9 to proximal sites increases the binding of epitope-tagged CjdCas9 to a previously inaccessible site at the POR locus.
[0024] FIG. 4 illustrates that binding of catalytically inactive SpCas9 (SpdCas9) to proximal sites increases the efficiency of cleavage by CjCas9. The sequences presented at the top show the relative locations of the CjCas9 target site and the SpdCas9 binding sites at the POR locus. The results of a Cel-I nuclease assay are shown at the bottom.
[0025] FIG. 5 illustrates that the binding of catalytically inactive SpCas9 (SpdCas9) to proximal sites increases the efficiency of FnCpf1 cleavage. The relative locations of the FnCpf1 target site and SpdCas9 binding sites at the POR locus are illustrated at the top and the results of a Cel-I nuclease assay are shown at the bottom.
[0026] FIG. 6 illustrates that binding of catalytically inactive SpCas9 (SpdCas9) to proximal site(s) increases specific cleavage by CjCas9. The target sites for CjCas9 at the HBD and HBB loci, as well as the SpdCas9 binding sites at the HBB locus, are shown at the top. The results of a Cel-I nuclease assay are shown at the bottom.
[0027] FIG. 7 illustrates that the binding of catalytically inactive FnCas9 (FndCas9) to proximal sites increases specific cleavage by SpCas9. The relative locations of the SpCas9 target site and the FndCas9 binding sites at the POR locus are indicated at the top. The results of a Cel-I nuclease assay are shown at the bottom.
[0028] FIG. 8 illustrates the enhancement of ssDNA oligo-mediated gene editing. The relative locations of the target sites at the POR locus and the sequence of the ssDNA oligo are shown at the top. The results of targeted integration of the EcoRI site are shown at the bottom. EcoRI site integration efficiencies (%) were determined with ImageJ. M: wide-range DNA markers. ND: not determined.
DETAILED DESCRIPTION
[0029] The present disclosure provides compositions and methods for increasing the accessibility of chromosomal DNA to targeting endonucleases and other programmable DNA modification proteins, wherein the increased accessibility leads to increased efficiency and/or specificity of targeted genome modification or epigenetic modification. Some CRISPR/Cas endonucleases have been found to have little or no activity in human cells. It is possible that nucleosome occupancy, nucleosome positioning, and the way a DNA sequence is wrapped around the histone octamer determine how accessible the sequence is to a DNA-binding protein (Chereji et al., Briefing Functional Genomics, 2014, 14: 506-60). Thus, it is possible that the hindrance imposed by the local chromatin configuration plays a role in the apparent inactivity of many CRISPR/Cas endonucleases in human cells. It has been found, as detailed herein, that binding of DNA-binding proteins to sites located proximal (i.e., within about 250 base pairs) to the target site of a targeted DNA modification protein increases the accessibility of the targeted DNA modification protein to the target site, thereby increasing the efficiency and/or specificity of targeted genome modification or targeted epigenetic modification. The compositions and methods described herein therefore enable efficient targeted genome modification or epigenetic modification using CRISPR/Cas endonucleases that were previously thought to be inactive in human cells. In addition, the compositions and methods described herein improve selective genome modification between nearly identical target sites, thereby reducing off-target effects.
(I) Compositions
[0030] One aspect of the present invention provides compositions comprising (a) programmable DNA modification proteins or nucleic acid encoding the programmable DNA modification proteins and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein. Programmable DNA modification proteins are detailed below in section (I)(a), programmable DNA-binding proteins are detailed below in section (I)(b), and the nucleic acids encoding these proteins are detailed below in section (I)(c).
(a) Programmable DNA Modification Proteins
[0031] A programmable DNA modification protein is a protein that binds to a specific target sequence in chromosomal DNA and modifies DNA or a DNA-associated protein in or near the sequence. Thus, a programmable DNA modification protein comprises a DNA binding domain and a catalytically active modification domain.
[0032] The DNA-binding domain is programmable in that it can be designed or engineered to recognize and bind different DNA sequences. In some embodiments, for example, DNA binding is mediated by interaction between the protein and the target DNA. Thus, the DNA-binding domain can be programmed to bind a DNA sequence of interest by protein engineering. In other embodiments, for example, DNA binding is mediated by a guide RNA that interacts with the programmable DNA-binding domain of the protein and the target DNA. In such cases, the programmable DNA-binding domain can be targeted to a DNA sequence of interest by designing the appropriate guide RNA.
[0033] A variety of modification domains can be included in programmable DNA modification proteins. In some embodiments, the modification domain is a nuclease domain, which has nuclease activity and cleaves both strands of a double-stranded DNA sequence (that is, it generates a double-stranded break). The double-stranded break can then be repaired by a cellular DNA repair process, such as non-homologous end joining (NHEJ) or homology-directed repair (HDR). As a consequence, the DNA sequence can be modified by deletion, insertion, and/or substitution of at least one base pair and up to, for example, many thousands of base pairs. Examples of programmable DNA modification proteins comprising nuclease domains include, without limitation, CRISPR/Cas nuclease systems, CRISPR/Cas dual nickase systems, zinc finger nucleases, transcription activator-like effector nucleases, meganucleases, fusion proteins comprising a nuclease domain linked to a programmable DNA-binding domain, and combinations thereof. Programmable DNA modification proteins comprising nuclease domains are detailed below in sections (I)(a)(i)-(vi).
[0034] In other embodiments, the modification domain of the programmable DNA modification protein has non-nuclease activity (e.g., epigenetic modification activity or transcriptional regulation activity), such that the programmable DNA modification protein modifies the structure and/or activity of the DNA and/or of protein(s) associated with the DNA. Thus, the programmable DNA modification protein is a fusion protein comprising a non-nuclease modification domain linked to a programmable DNA-binding domain. Such proteins are detailed below in section (I)(a)(vii).
[0035] Programmable DNA modification proteins can comprise wild-type or naturally occurring DNA-binding and/or modification domains, modified versions of naturally occurring DNA-binding and/or modification domains, synthetic or artificial DNA-binding and/or modification domains, and combinations thereof.
(i) CRISPR/Cas Nuclease Systems
[0036] In some embodiments, the programmable DNA modification protein can be an RNA-guided CRISPR/Cas nuclease system, which introduces a double-stranded break in the DNA. The CRISPR/Cas nuclease system comprises a CRISPR/Cas nuclease and a guide RNA.
[0037] CRISPR/Cas Nuclease. In certain embodiments, the CRISPR/Cas nuclease can be derived from a type I (i.e., IA, IB, IC, ID, IE, or IF), type II (i.e., IIA, IIB, or IIC), type III (i.e., IIIA or IIIB), or type V CRISPR system, which are present in various bacteria and archaea. For example, the CRISPR/Cas system can be from Streptococcus sp. (for example, Streptococcus pyogenes), Campylobacter sp. (for example, Campylobacter jejuni), Francisella sp. (for example, Francisella novicida), Acaryochloris sp., Acetohalobium sp., Acidaminococcus sp., Acidithiobacillus sp., Alicyclobacillus sp., Allochromatium sp., Ammonifex sp., Anabaena sp., Arthrospira sp., Bacillus sp., Burkholderiales sp., Caldicellulosiruptor sp., Candidatus sp., Clostridium sp., Crocosphaera sp., Cyanothece sp., Exiguobacterium sp., Finegoldia sp., Ktedonobacter sp., Lachnospiraceae sp., Lactobacillus sp., Lyngbya sp., Marinobacterium sp., Microscilla sp., Microcoleus sp., Microcystis sp., Natranaerobius sp., Neisseria sp., Nitrosococcus sp., Nocardiopsis sp., Nodularia sp., Nostoc sp., Oscillatoria sp., Polaromonas sp., Pelotomaculum sp., Pseudoalteromonas sp., Petrotoga sp., Prevotella sp., Staphylococcus sp., Streptomyces sp., Streptosporangium sp., Synechococcus sp., Thermosipho sp., or Verrucomicrobia sp. In still other embodiments, the CRISPR/Cas nuclease can be derived from an archaeal CRISPR system, a CRISPR-CasX system, or a CRISPR-CasY system (Burstein et al., Nature, 2017, 542(7640): 237-241).
[0038] In one particular embodiment, the CRISPR/Cas nuclease can be derived from a type I CRISPR/Cas system. In another particular embodiment, the CRISPR/Cas nuclease can be derived from a type II CRISPR/Cas system. In another particular embodiment, the CRISPR/Cas nuclease can be derived from a type III CRISPR/Cas system. In another particular embodiment, the CRISPR/Cas nuclease can be derived from a type V CRISPR/Cas system.
[0039] Non-limiting examples of suitable CRISPR proteins include Cas proteins, Cpf proteins, C2c proteins (e.g., C2c1, C2c2, C2c3), Cmr proteins, Csa proteins, Csb proteins, Csc proteins, Cse proteins, Csm proteins, Csn proteins, Csx proteins, Csy proteins, Csz proteins, and derivatives or variants thereof. In specific embodiments, the CRISPR/Cas nuclease can be a type II Cas9 protein, a type V Cpf1 protein, or a derivative thereof.
[0040] In some embodiments, the CRISPR/Cas nuclease can be Streptococcus pyogenes Cas9 (SpCas9) or Streptococcus thermophilus Cas9 (StCas9). In other embodiments, the CRISPR/Cas nuclease can be Campylobacter jejuni Cas9 (CjCas9). In alternative embodiments, the CRISPR/Cas nuclease can be Francisella novicida Cas9 (FnCas9). In still other embodiments, the CRISPR/Cas nuclease can be Neisseria cinerea Cas9 (NcCas9). In other embodiments, the CRISPR/Cas nuclease can be Francisella novicida Cpf1 (FnCpf1), Acidaminococcus sp. Cpf1 (AsCpf1), or Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1).
[0041] In general, the CRISPR/Cas nuclease comprises an RNA recognition and/or RNA binding domain, which interacts with the guide RNA. The CRISPR/Cas nuclease also comprises at least one nuclease domain having endonuclease activity. For example, a Cas9 protein comprises a RuvC-like nuclease domain and an HNH-like nuclease domain, and a Cpf1 protein comprises a RuvC-like domain. CRISPR/Cas nucleases can also comprise DNA-binding domains, helicase domains, RNase domains, protein-protein interaction domains, dimerization domains, as well as other domains.
[0042] The CRISPR/Cas nuclease can further comprise at least one nuclear localization signal, cell-penetrating domain, and/or marker domain. Non-limiting examples of nuclear localization signals include PKKKRKV (SEQ ID NO: 1), PKKKRRV (SEQ ID NO: 2), KRPAATKKAGQAKKKK (SEQ ID NO: 3), YGRKKRRQRRR (SEQ ID NO: 28), RKKRRQRRR (SEQ ID NO: 29), PAAKRVKLD (SEQ ID NO: 30), RQRRNELKRSP (SEQ ID NO: 31), VSRKRPRP (SEQ ID NO: 32), PPKKARED (SEQ ID NO: 33), PQPKKKPL (SEQ ID NO: 34), SALIKKKKKMAP (SEQ ID NO: 35), PKQKKRK (SEQ ID NO: 36), RKLKKKIKKL (SEQ ID NO: 37), REKKKFLKRR (SEQ ID NO: 38), KRKGDEVDGVDEVAKKKSKK (SEQ ID NO: 39), RKCLQAGMNLEARKKK (SEQ ID NO: 40), NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 41), and RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO: 42). Examples of cell-penetrating domains include GALFLGWLGAAGSTMGAPKKKRKV (SEQ ID NO: 6), GALFLGFLGAAGSTMGAWSQPKKKRKV (SEQ ID NO: 7), KETWWETWWTEWSQPKKKRKV (SEQ ID NO: 8), YARAAARQARA (SEQ ID NO: 43), [...] (SEQ ID NO: 45), RRQRRTSKLMKR (SEQ ID NO: 46), GWTLNSAGYLLGKINLKALAALAKKIL (SEQ ID NO: 47), KALAWEAKLAKALAKALAKHLAKALAKALKCEA (SEQ ID NO: 48), and RQIKIWFQNRRMKWKK (SEQ ID NO: 49). Marker domains include fluorescent proteins and purification or epitope tags.
Suitable fluorescent proteins include, but are not limited to, green fluorescent proteins (for example, GFP, eGFP, GFP-2, tagGFP, turboGFP, Emerald, Azami Green, Monomeric Azami Green, CopGFP, AceGFP, ZsGreen1), yellow fluorescent proteins (for example, YFP, EYFP, Citrine, Venus, YPet, PhiYFP, ZsYellow1), blue fluorescent proteins (for example, BFP, EBFP, EBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-sapphire), cyan fluorescent proteins (for example, ECFP, Cerulean, CyPet, AmCyan1, Midoriishi-Cyan), red fluorescent proteins (for example, mKate, mKate2, mPlum, DsRed monomer, mCherry, mRFP1, DsRed-Express, DsRed2, DsRed-Monomer, HcRed-Tandem, eqFP611, mRaspberry, mStrawberry, JRed), and orange fluorescent proteins (for example, mOrange, mKO, Kusabira-Orange, Monomeric Kusabira-Orange, mTangerine, tdTomato). Non-limiting examples of suitable purification or epitope tags include 6xHis, FLAG, HA, GST, Myc, and the like.
[0043] The nuclear localization signal, the cell-penetrating domain, and/or the marker domain can be located at the N-terminus, at the C-terminus, or at an internal location of the protein. In some embodiments, the CRISPR/Cas nuclease can further comprise at least one detectable marker. The detectable marker can be a fluorophore (for example, FAM, TMR, Cy3, Cy5, Texas Red, Oregon Green, Alexa Fluors, Halo tags, or another suitable fluorescent tag or dye), a chromophore (for example, biotin, digoxigenin, and the like), quantum dots, or gold particles. The detectable marker can be attached by conventional means to any amino acid of the protein.
[0044] Guide RNA. The CRISPR/Cas nuclease system also comprises a guide RNA (gRNA). The guide RNA interacts with the CRISPR/Cas nuclease and the target site to guide the CRISPR/Cas nuclease to the target site in the chromosomal sequence. The target site has no sequence limitation except that the sequence is bordered by a protospacer adjacent motif (PAM). For example, PAM sequences for Cas9 proteins include 3'-NGG, 3'-NGGNG, 3'-NNAGAAW, and 3'-ACAY, and PAM sequences for Cpf1 include 5'-TTN (where N is defined as any nucleotide, W is defined as A or T, and Y is defined as C or T).
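The PAM constraint described in this paragraph can be illustrated with a short sequence scan. The Python sketch below is a hypothetical illustration, not a tool described in this disclosure: the function names, the `pam_side` parameter, and the default protospacer length of 20 nucleotides are assumptions. It uses only the degenerate codes defined in the text (N, W, Y) and scans only the strand it is given.

```python
# Degenerate nucleotide codes as defined in the text: N = any, W = A or T, Y = C or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "W": "AT", "Y": "CT"}


def matches(motif: str, seq: str) -> bool:
    """True if seq is one concrete instance of the degenerate motif."""
    return len(motif) == len(seq) and all(b in IUPAC[m] for m, b in zip(motif, seq))


def find_sites(dna: str, pam: str, pam_side: str,
               protospacer_len: int = 20) -> list[tuple[int, str]]:
    """Return (start, protospacer) for every PAM hit on the given strand.

    pam_side="3prime": PAM immediately follows the protospacer (Cas9-style notation).
    pam_side="5prime": PAM immediately precedes the protospacer (Cpf1-style notation).
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - len(pam) + 1):
        if not matches(pam, dna[i:i + len(pam)]):
            continue
        start = i - protospacer_len if pam_side == "3prime" else i + len(pam)
        if start >= 0 and start + protospacer_len <= len(dna):
            hits.append((start, dna[start:start + protospacer_len]))
    return hits


# Made-up test sequences: a 20-nt protospacer followed by TGG, and TTA followed by 20 nt.
print(find_sites("ACGTACGTACGTACGTACGT" + "TGG", "NGG", "3prime"))
print(find_sites("TTA" + "C" * 20, "TTN", "5prime"))
```

Scanning the reverse complement as well would be needed to find sites on the opposite strand; that step is omitted here to keep the sketch short.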
[0045] Each guide RNA can comprise three regions: a first region at the 5' end that has complementarity to the target site in the chromosomal DNA sequence, a second region that is internal and forms a stem-loop structure, and a third region at the 3' end that remains essentially single-stranded. The second and third regions form a secondary structure that interacts with the CRISPR/Cas protein. The first region of each guide RNA is different (that is, it is sequence specific). The second and third regions can be the same in guide RNAs that complex with a particular CRISPR/Cas protein.
[0046] The first region of the guide RNA has complementarity to the sequence (i.e., protospacer sequence) at the target site, such that the first region of the guide RNA can base pair with the target sequence. For example, the first region of a SpCas9 guide RNA can comprise GN17-20GG. In general, the complementarity between the first region (i.e., crRNA) of the guide RNA and the target sequence is at least 80%, at least 85%, at least 90%, at least 95%, or more. In various embodiments, the first region of the guide RNA can comprise from about 10 nucleotides to more than about 25 nucleotides. For example, the region of base pairing between the first region of the guide RNA and the target site in the cDNA sequence can be about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25, or more than 25 nucleotides in length. In an exemplary embodiment, the first region of the guide RNA is about 19, 20, or 21 nucleotides in length.
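The percent complementarity mentioned in this paragraph can be computed position by position. The Python sketch below is a minimal illustration under stated assumptions: it counts only Watson-Crick pairs (no G-U wobble), requires the two sequences to have equal length with no gaps, and takes both sequences written 5' to 3'. The function name and the example sequences are invented for illustration.

```python
# RNA base -> DNA base it pairs with (Watson-Crick pairs only; an assumption).
PAIR = {"A": "T", "C": "G", "G": "C", "U": "A"}


def percent_complementarity(guide_rna: str, target_strand: str) -> float:
    """Percent of guide positions that pair with the target DNA strand.

    guide_rna: first region of the guide RNA, written 5'->3'.
    target_strand: the DNA strand the guide hybridizes to, written 5'->3'.
    """
    guide = guide_rna.upper()
    target = target_strand.upper()[::-1]  # read 3'->5' so position i faces guide position i
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(PAIR.get(g) == t for g, t in zip(guide, target))
    return 100.0 * paired / len(guide)


guide = "GACUGACUGACUGACUGACU"            # made-up 20-nt first region
perfect = "AGTCAGTCAGTCAGTCAGTC"          # fully complementary target strand
two_mismatches = "TCTCAGTCAGTCAGTCAGTC"   # same strand with two bases changed

print(percent_complementarity(guide, perfect))
print(percent_complementarity(guide, two_mismatches) >= 80.0)
```

A threshold such as 80%, 85%, 90%, or 95% can then be applied to the returned value, matching the ranges listed in the text.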
[0047] The guide RNA also comprises a second region that forms a secondary structure. In some embodiments, the secondary structure comprises at least one stem (or hairpin) and loop. The length of each loop and stem can vary. For example, the loop can range from about 3 to about 10 nucleotides in length, and the stem can range from about 6 to about 20 base pairs in length. The stem can comprise one or more bulges of 1 to about 10 nucleotides. Thus, the overall length of the second region can range from about 16 to about 60 nucleotides. The guide RNA also comprises a third region at the 3' end that remains essentially single-stranded. Thus, the third region has no complementarity to any nucleic acid sequence in the cell of interest and has no complementarity to the rest of the guide RNA. The length of the third region can vary. In general, the third region is more than about 4 nucleotides in length. For example, the length of the third region can range from about 5 to about 60 nucleotides.
[0048] The combined length of the second and third regions (also called the universal or scaffold region) of the guide RNA can range from about 30 to about 120 nucleotides. In one aspect, the combined length of the second and third regions of the guide RNA ranges from about 70 to about 100 nucleotides.
[0049] In still other embodiments, the second and third regions of the guide RNA can comprise one or more additional stem-loop regions, wherein the stem-loop regions comprise aptamer sequences (Konermann et al., Nature, 2015, 517(7536):583-588; Zalatan et al., Cell, 2015, 160(1-2):339-50). Suitable aptamer sequences include those that bind adaptor proteins chosen from MS2, PP7, COM, Qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, ΦCb5, ΦCb8r, ΦCb12r, ΦCb23r, 7s, PRR1, HSF1, AID, APOBEC1, p300, TET1/2/3, VP64, GFP, Rta, p65, MyoD1, or VP160. In such embodiments, the total length of the second and third regions of the guide RNA can range up to about 125 nucleotides, up to about 150 nucleotides, up to about 175 nucleotides, up to about 200 nucleotides, up to about 225 nucleotides, up to about 250 nucleotides, up to about 275 nucleotides, or up to about 300 nucleotides.
[0050] In some embodiments, the guide RNA can be a single molecule comprising all three regions. In other embodiments, the guide RNA can comprise two separate molecules. The first RNA molecule (i.e., crRNA) can comprise the first region of the guide RNA and one half of the "stem" of the second region of the guide RNA. The second RNA molecule (i.e., tracrRNA) can comprise the other half of the "stem" of the second region of the guide RNA and the third region of the guide RNA. Thus, in this embodiment, the first and second RNA molecules each contain a sequence of nucleotides that are complementary to one another. For example, in one embodiment, the crRNA and tracrRNA molecules each comprise a sequence (of about 6 to about 20 nucleotides) that base pairs with the other sequence to form a functional guide RNA. For example, the guide RNA of a type II CRISPR/Cas system can comprise crRNA and tracrRNA. In some aspects, the crRNA for a type II CRISPR/Cas system can be chemically synthesized and the tracrRNA for the type II CRISPR/Cas system can be synthesized in vitro (see section (I)(c) below). In other embodiments, the guide RNA of a type V CRISPR/Cas system can comprise only crRNA.
[0051] The guide RNA can comprise standard ribonucleotides, modified ribonucleotides (e.g., pseudouridine), ribonucleotide isomers, and/or ribonucleotide analogs. In some embodiments, the guide RNA can further comprise at least one detectable label. The detectable label can be a fluorophore (e.g., FAM, TMR, Cy3, Cy5, Texas Red, Oregon Green, Alexa Fluors, Halo tags, or a suitable fluorescent dye), a chromophore (e.g., biotin, digoxigenin, and the like), quantum dots, or gold particles. Those skilled in the art are familiar with gRNA design and construction; for example, gRNA design tools are available on the internet or from commercial sources.
[0052] The guide RNA can be chemically synthesized, enzymatically synthesized, or a combination thereof. For example, the guide RNA can be synthesized using standard phosphoramidite-based solid-phase synthesis methods. Alternatively, the guide RNA can be synthesized in vitro by operably linking DNA encoding the guide RNA to a promoter control sequence that is recognized by a phage RNA polymerase. Examples of suitable phage promoter sequences include T7, T3, and SP6 promoter sequences, or variations thereof. In embodiments in which the guide RNA comprises two separate molecules (i.e., crRNA and tracrRNA), the crRNA can be chemically synthesized and the tracrRNA can be enzymatically synthesized. The nucleic acid encoding the guide RNA can be part of a plasmid vector, which can further comprise additional expression control sequences (e.g., enhancer sequences, Kozak sequences, polyadenylation sequences, transcriptional termination sequences, etc.), selectable marker sequences (e.g., antibiotic resistance genes), origins of replication, and the like. As detailed below in section (I)(c), the nucleic acid encoding the guide RNA can be operably linked to a promoter control sequence that is recognized by RNA polymerase III (Pol III) for expression in eukaryotic cells.
(ii) CRISPR/Cas Dual Nickase Systems
[0053] In other embodiments, the programmable DNA modification protein can be a CRISPR/Cas dual nickase system. CRISPR/Cas dual nickase systems are similar to the CRISPR/Cas nuclease systems described above in section (I)(a)(i), except that the CRISPR/Cas nuclease is modified to cleave only one strand of DNA. Thus, a single CRISPR/Cas nickase system creates a single-stranded break, or nick, in double-stranded DNA, and a paired CRISPR/Cas nickase system comprising paired offset guide RNAs creates a double-stranded break in the DNA.
[0054] A CRISPR/Cas nuclease can be converted to a nickase by one or more mutations and/or deletions. For example, a Cas9 nickase can comprise one or more mutations in one of the nuclease domains (e.g., the RuvC-like domain or the HNH-like domain). For example, the one or more mutations can be D10A, D8A, E762A, and/or D986A in the RuvC-like domain, or the one or more mutations can be H840A, H559A, N854A, N856A, and/or N863A in the HNH-like domain.
(iii) Zinc Finger Nucleases
[0055] In still other embodiments, the programmable DNA modification protein can be a zinc finger nuclease (ZFN). A ZFN comprises a DNA-binding zinc finger region and a nuclease domain. The zinc finger region can comprise from about two to seven zinc fingers, for example, about four to six zinc fingers, wherein each zinc finger binds three nucleotides. The zinc finger region can be engineered to recognize and bind to any DNA sequence. Zinc finger design tools or algorithms are available on the internet or from commercial sources. The zinc fingers can be linked together using suitable linker sequences.
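Because each zinc finger binds three nucleotides, the length of the site recognized by a zinc finger region follows directly from the number of fingers. The Python sketch below is illustrative only; the function names and the example site are hypothetical, and it ignores finger-to-finger context effects that real design tools take into account.

```python
NUCLEOTIDES_PER_FINGER = 3  # stated above: each zinc finger binds three nucleotides


def zinc_finger_site_length(n_fingers: int) -> int:
    """Length in nucleotides of the site bound by an array of n_fingers fingers."""
    if n_fingers < 1:
        raise ValueError("at least one zinc finger is required")
    return n_fingers * NUCLEOTIDES_PER_FINGER


def split_into_triplets(site: str) -> list[str]:
    """Split a binding site into consecutive 3-nt subsites, one per finger."""
    if len(site) % NUCLEOTIDES_PER_FINGER != 0:
        raise ValueError("site length must be a multiple of three")
    return [
        site[i:i + NUCLEOTIDES_PER_FINGER]
        for i in range(0, len(site), NUCLEOTIDES_PER_FINGER)
    ]
```

Under this arithmetic, a region of four to six zinc fingers recognizes a site of 12 to 18 nucleotides.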
[0056] A ZFN also comprises a nuclease domain, which can be obtained from any endonuclease or exonuclease. Non-limiting examples of endonucleases from which a nuclease domain can be derived include restriction endonucleases and homing endonucleases. In some embodiments, the nuclease domain can be derived from a type II-S restriction endonuclease. Type II-S endonucleases cleave DNA at sites that are typically several base pairs away from the recognition/binding site and, as such, have separable binding and cleavage domains. These enzymes generally are monomers that transiently associate to form dimers to cleave each strand of DNA at staggered locations. Non-limiting examples of suitable type II-S endonucleases include BfiI, BpmI, BsaI, BsgI, BsmBI, BsmI, BspMI, FokI, MboII, and SapI. In some embodiments, the nuclease domain can be a FokI nuclease domain or a derivative thereof. The type II-S nuclease domain can be modified to facilitate dimerization of two different nuclease domains. For example, the cleavage domain of FokI can be modified by mutating certain amino acid residues. As a non-limiting example, amino acid residues at positions 446, 447, 479, 483, 484, 486, 487, 490, 491, 496, 498, 499, 500, 531, 534, 537, and 538 of FokI nuclease domains are targets for modification. For example, one modified FokI domain can comprise Q486E, I499L, and/or N496D mutations, and the other modified FokI domain can comprise E490K, I538K, and/or H537R mutations.
[0057] The ZFN can further comprise at least one nuclear localization signal, cell-penetrating domain, and/or marker domain, which are described above in section (I)(a)(i).
(iv) Transcription Activator-Like Effector Nucleases
[0058] In alternative embodiments, the programmable DNA modification protein can be a transcription activator-like effector nuclease (TALEN). TALENs comprise a DNA binding domain composed of highly conserved repeats derived from transcription activator-like effectors (TALEs) that is linked to a nuclease domain. TALEs are proteins secreted by the plant pathogen Xanthomonas to alter the transcription of genes in host plant cells. TALE repeat arrays can be engineered through modular protein design to target any DNA sequence of interest. The nuclease domain of TALENs can be any nuclease domain as described above in section (I)(a)(iii). In specific embodiments, the nuclease domain is derived from FokI (Sanjana et al., 2012, Nat Protoc, 7(1):171-192).
[0059] The TALEN can also comprise at least one nuclear localization signal, cell-penetrating domain, marker domain, and/or detectable label, which are described above in section (I)(a)(i).
(v) Meganucleases or Rare-Cutting Endonucleases
[0060] In yet other embodiments, the programmable DNA-modifying protein can be a meganuclease or a derivative thereof. Meganucleases are endodeoxyribonucleases characterized by long recognition sequences, i.e., the recognition sequence generally ranges from about 12 base pairs to about 45 base pairs. As a consequence of this requirement, the recognition sequence generally occurs only once in any given genome. Among meganucleases, the homing endonuclease family, called LAGLIDADG, has become a valuable tool for the study of genomes and genome engineering. In some embodiments, the meganuclease can be I-SceI, I-TevI or their variants. A meganuclease can be targeted to a specific chromosomal sequence by modifying its recognition sequence using techniques well known to those skilled in the art.
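The link between recognition sequence length and rarity can be made concrete with a back-of-the-envelope estimate. The Python sketch below is illustrative only and rests on a strong simplifying assumption that is not made in the disclosure: the genome is treated as a single strand of independent, equally frequent bases. The function name and the 3,000,000,000 bp genome size are hypothetical.

```python
def expected_occurrences(genome_length: int, site_length: int) -> float:
    """Expected number of exact matches of one site_length-nt sequence.

    Assumption: bases are independent and equally frequent (probability 1/4 each),
    and only one strand is scanned.
    """
    if site_length < 1 or genome_length < site_length:
        raise ValueError("site must be at least 1 nt and no longer than the genome")
    positions = genome_length - site_length + 1  # number of possible start positions
    return positions * 0.25 ** site_length


# Hypothetical 3,000,000,000 bp genome.
GENOME_BP = 3_000_000_000
short_site = expected_occurrences(GENOME_BP, 8)   # many expected matches
long_site = expected_occurrences(GENOME_BP, 18)   # fewer than one expected match
```

Under these assumptions the expected number of matches falls by a factor of four for every added base pair, so an 18 bp recognition sequence is expected less than once in a genome of this size, whereas a short site is expected many times. Real genomes are not random, so actual site counts can differ substantially.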
[0061] In alternative embodiments, the programmable DNA modification protein can be a rare-cutting endonuclease or a derivative thereof. Rare-cutting endonucleases are site-specific endonucleases whose recognition sequence occurs rarely in a genome, preferably only once in a genome. The rare-cutting endonuclease can recognize a 7-nucleotide sequence, an 8-nucleotide sequence, or a longer recognition sequence. Non-limiting examples of rare-cutting endonucleases include NotI, AscI, PacI, AsiSI, SbfI, and FseI.
[0062] The meganuclease or rare-cutting endonuclease can also comprise at least one nuclear localization signal, cell-penetrating domain, marker domain, and/or detectable label, which are described above in section (I)(a)(i).
(vi) Programmable Fusion Proteins Comprising Nuclease Domains
[0063] In still further embodiments, the programmable DNA modification protein can be a fusion protein comprising a programmable DNA binding domain linked to a (double-stranded cleavage) nuclease domain. The nuclease domain of the fusion protein can be any of those described above in section (I)(a)(iii), a nuclease domain derived from a CRISPR/Cas nuclease (e.g., the RuvC-like or HNH-like nuclease domains of Cas9 or the nuclease domain of Cpf1), or a nuclease domain derived from a meganuclease or rare-cutting endonuclease.
[0064] The programmable DNA binding domain of the fusion protein can be a programmable endonuclease (i.e., a CRISPR/Cas nuclease or meganuclease) modified to lack all nuclease activity. Thus, the DNA binding domain of the fusion protein can be a catalytically inactive CRISPR/Cas system or a catalytically inactive meganuclease. Alternatively, the programmable DNA binding domain of the fusion protein can be a programmable DNA binding protein such as, for example, a zinc finger protein or a transcription activator-like effector. In some embodiments, the programmable DNA binding domain can be a catalytically inactive CRISPR/Cas nuclease in which the nuclease activity has been eliminated by mutation and/or deletion. For example, the catalytically inactive CRISPR/Cas protein can be a catalytically inactive (dead) Cas9 (dCas9) in which the RuvC-like domain comprises a D10A, D8A, E762A, and/or D986A mutation and the HNH-like domain comprises an H840A, H559A, N854A, N856A, and/or N863A mutation. Alternatively, the catalytically inactive CRISPR/Cas protein can be a catalytically inactive (dead) Cpf1 protein comprising comparable mutations in the nuclease domain. In still other embodiments, the programmable DNA binding domain can be a catalytically inactive meganuclease in which the nuclease activity has been eliminated by mutation and/or deletion; for example, the catalytically inactive meganuclease can comprise a C-terminal truncation.
[0065] The fusion protein comprising nuclease activity can also comprise at least one nuclear localization signal, cell-penetrating domain, marker domain, and/or detectable label, which are described above in section (I)(a)(i).
(vii) Programmable Fusion Proteins/Complexes Comprising Non-Nuclease Domains
[0066] In alternative embodiments, the programmable DNA modification protein may be a fusion protein comprising a programmable DNA binding domain linked to a non-nuclease modification domain. Suitable programmable DNA binding domains are described above in section (I) (a) (vi).
[0067] In some embodiments, the non-nuclease modification domain can be an epigenetic modification domain, which alters DNA or chromatin structure (and may or may not alter the DNA sequence). Non-limiting examples of suitable epigenetic modification domains include those with DNA methyltransferase activity (e.g., cytosine methyltransferase), DNA demethylase activity, DNA deamination (e.g., cytosine deaminase, adenosine deaminase, guanine deaminase), DNA amination, DNA helicase activity, histone acetyltransferase (HAT) activity (e.g., HAT domain derived from E1A binding protein p300), histone deacetylase activity, histone methyltransferase activity, histone demethylase activity, histone kinase activity, histone phosphatase activity, histone ubiquitin ligase activity, histone deubiquitinating activity, histone adenylation activity, histone deadenylation activity, histone SUMOylation activity, histone deSUMOylation activity, histone ribosylation activity, histone deribosylation activity, histone myristoylation activity, histone demyristoylation activity, histone citrullination activity, histone alkylation activity, histone dealkylation activity, or histone oxidation activity. In specific embodiments, the epigenetic modification domain can comprise cytosine deaminase activity, histone acetyltransferase activity, or DNA methyltransferase activity.
[0068] In other embodiments, the non-nuclease modification domain can be a transcriptional activation domain or a transcriptional repressor domain. Suitable transcriptional activation domains include, without limitation, the herpes simplex virus VP16 domain, VP64 (which is a tetrameric derivative of VP16), VP160, NFκB p65 activation domains, p53 activation domains 1 and 2, CREB (cAMP response element binding protein) activation domains, E2A activation domains, the activation domain from human heat-shock factor 1 (HSF1), or NFAT (nuclear factor of activated T-cells) activation domains. Non-limiting examples of suitable transcriptional repressor domains include inducible cAMP early repressor (ICER) domains, Kruppel-associated box A (KRAB-A) repressor domains, YY1 glycine-rich repressor domains, Sp1-like repressors, E(spl) repressors, IκB repressor, or MeCP2. The transcriptional activation or transcriptional repressor domains can be genetically fused to the DNA binding protein or bound via noncovalent protein-protein, protein-RNA, or protein-DNA interactions.
[0069] In embodiments in which the programmable DNA modification protein comprises a CRISPR/Cas system, the guide RNA of the CRISPR/Cas system can comprise aptamer sequences that bind transcriptional activators, transcriptional repressors, or epigenetic modification proteins (Konermann et al., Nature, 2015, 517(7536):583-588; Zalatan et al., Cell, 2015, 160(1-2):339-50).
[0070] The fusion protein comprising non-nuclease activity can also comprise at least one nuclear localization signal, cell-penetrating domain, marker domain, and/or detectable label, which are described above in section (I)(a)(i).
(b) Programmable DNA-binding proteins
[0071] The composition also comprises at least one programmable DNA binding protein. Programmable DNA binding proteins are proteins that bind to specific DNA sequences, but do not modify the DNA or protein (s) associated with the DNA.
[0072] In some embodiments, the at least one programmable DNA binding protein can be a CRISPR/Cas nuclease modified to lack nuclease activity. For example, the programmable DNA binding protein can be a catalytically inactive CRISPR/Cas system. For this, the CRISPR/Cas nuclease can be modified by mutation and/or deletion to eliminate all nuclease activity. In one embodiment, the RuvC-like domain and the HNH-like domain both comprise one or more mutations and/or deletions to eliminate nuclease activity. For example, the catalytically inactive CRISPR/Cas protein can be a catalytically inactive (dead) Cas9 (dCas9) in which the RuvC-like domain comprises a D10A, D8A, E762A, and/or D986A mutation and the HNH-like domain comprises an H840A, H559A, N854A, N856A, and/or N863A mutation. Alternatively, the catalytically inactive CRISPR/Cas protein can be a catalytically inactive (dead) Cpf1 protein comprising comparable mutations in the nuclease domain. In other aspects, the programmable DNA binding protein can be a CRISPR/Cas protein modified to nick one strand of a double-stranded sequence (i.e., a nickase), as detailed above in section (I)(a)(ii).
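The relationship between the listed mutations and the resulting Cas9 activity (nuclease, nickase, or catalytically inactive) can be summarized in a short Python sketch. It is illustrative only. The mutation lists are taken from the text; the function name is hypothetical, and the rule that any single listed mutation inactivates its domain is a simplifying assumption.

```python
# Mutations named in the text, grouped by the nuclease domain they affect.
RUVC_LIKE_MUTATIONS = frozenset({"D10A", "D8A", "E762A", "D986A"})
HNH_LIKE_MUTATIONS = frozenset({"H840A", "H559A", "N854A", "N856A", "N863A"})


def classify_cas9(mutations) -> str:
    """Classify a Cas9 variant from an iterable of mutation names.

    Assumption: one listed mutation is enough to inactivate its domain.
    """
    present = set(mutations)
    ruvc_inactive = bool(present & RUVC_LIKE_MUTATIONS)
    hnh_inactive = bool(present & HNH_LIKE_MUTATIONS)
    if ruvc_inactive and hnh_inactive:
        return "catalytically inactive (dCas9)"  # neither strand is cleaved
    if ruvc_inactive or hnh_inactive:
        return "nickase"  # only one strand is cleaved
    return "nuclease"  # both strands are cleaved
```

For example, `classify_cas9(["D10A"])` returns `"nickase"`, while `classify_cas9(["D10A", "H840A"])` returns `"catalytically inactive (dCas9)"`.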
[0073] In other embodiments, the at least one programmable DNA binding protein can be a catalytically inactive meganuclease in which the nuclease activity has been eliminated by mutation and/or deletion; for example, the catalytically inactive meganuclease can comprise a C-terminal truncation. In still other embodiments, the at least one programmable DNA binding protein can be a zinc finger protein or a transcription activator-like effector (TALE). In additional embodiments, the at least one programmable DNA binding protein can be a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase. ZFN, TALEN, and meganuclease nickases comprise mutations and/or deletions in one of the nuclease domains or half-domains, such that the nickase cleaves only one strand of a double-stranded sequence.
[0074] The programmable DNA binding protein can also comprise at least one nuclear localization signal, cell-penetrating domain, marker domain, and/or detectable label, which are described above in section (I)(a)(i).
(c) Nucleic acids encoding programmable DNA-modifying proteins or programmable DNA-binding proteins
[0075] The nucleic acid encoding the programmable DNA modification protein, described above in section (I) (a), or the programmable DNA binding protein, described above in section (I) (b), can be DNA or RNA, linear or circular, single-stranded or double-stranded. RNA or DNA can be codon-optimized for efficient translation into protein in the eukaryotic cell of interest. Codon optimization programs are available as freeware or from commercial sources.
[0076] In some embodiments, the nucleic acid encoding the programmable DNA modification protein or the at least one programmable DNA binding protein can be mRNA. The mRNA can be synthesized in vitro. For this, DNA encoding the DNA modification protein or the at least one DNA binding protein can be operably linked to a promoter sequence that is recognized by a phage RNA polymerase for in vitro synthesis of mRNA. For example, the promoter sequence can be a T7, T3, or SP6 promoter sequence or a variation of a T7, T3, or SP6 promoter sequence. In such embodiments, the in vitro-transcribed RNA can be purified, capped, and/or polyadenylated. As detailed below, the DNA encoding the DNA modification protein or the DNA binding protein can be part of a vector.
[0077] In other embodiments, the nucleic acid encoding the programmable DNA modification protein or the at least one programmable DNA binding protein can be DNA. The DNA sequence encoding the programmable DNA modification protein or the at least one programmable DNA binding protein can be operably linked to at least one promoter control sequence for expression in the cell of interest. In some embodiments, the DNA coding sequence can also be linked to a polyadenylation signal (e.g., SV40 polyA signal, bovine growth hormone (BGH) polyA signal, etc.) and/or at least one transcriptional termination sequence.
[0078] In certain embodiments, the DNA coding sequence can be operably linked to a promoter sequence for expression of the DNA modification protein or the DNA binding protein in bacterial cells (e.g., E. coli) or eukaryotic cells (e.g., yeast, insect, or mammalian cells). Suitable bacterial promoters include, without limitation, T7 promoters, lac operon promoters, trp promoters, tac promoters (which are hybrids of trp and lac promoters), variations of any of the foregoing, and combinations of any of the foregoing. Non-limiting examples of suitable eukaryotic promoters include constitutive, regulated, or cell- or tissue-specific promoters. Suitable eukaryotic constitutive promoter control sequences include, but are not limited to, cytomegalovirus immediate early promoter (CMV), simian virus (SV40) promoter, adenovirus major late promoter, Rous sarcoma virus (RSV) promoter, mouse mammary tumor virus (MMTV) promoter, phosphoglycerate kinase (PGK) promoter, elongation factor (ED1)-alpha promoter, ubiquitin promoters, actin promoters, tubulin promoters, immunoglobulin promoters, fragments thereof, or combinations of any of the foregoing. Examples of suitable eukaryotic regulated promoter control sequences include, without limitation, those regulated by heat shock, metals, steroids, antibiotics, or alcohol. Non-limiting examples of tissue-specific promoters include the B29 promoter, CD14 promoter, CD43 promoter, CD45 promoter, CD68 promoter, desmin promoter, elastase-1 promoter, endoglin promoter, fibronectin promoter, Flt-1 promoter, GFAP promoter, GPIIb promoter, ICAM-2 promoter, INF-β promoter, Mb promoter, NphsI promoter, OG-2 promoter, SP-B promoter, SYN1 promoter, and WASP promoter. The promoter sequence can be wild type or can be modified for more efficient or efficacious expression.
[0079] In various embodiments, the nucleic acid encoding the programmable DNA modification protein and/or the at least one programmable DNA binding protein can be present in a vector. Suitable vectors include plasmid vectors, phagemids, cosmids, artificial/mini-chromosomes, transposons, and viral vectors (e.g., lentiviral vectors, adeno-associated viral vectors, adenoviral vectors, etc.). In one embodiment, the DNA encoding the programmable DNA modification protein and/or the at least one programmable DNA binding protein can be present in a plasmid vector. Non-limiting examples of suitable plasmid vectors include pUC, pBR322, pET, pBluescript, and variants thereof. In other embodiments, the nucleic acid encoding the programmable DNA modification protein and/or the at least one programmable DNA binding protein can be present in a viral vector. The plasmid or viral vector can comprise additional expression control sequences (e.g., enhancer sequences, Kozak sequences, polyadenylation sequences, transcriptional termination sequences, etc.), selectable marker sequences (e.g., antibiotic resistance genes), origins of replication, and the like. Additional information can be found in "Current Protocols in Molecular Biology," Ausubel et al., John Wiley & Sons, New York, 2003, or "Molecular Cloning: A Laboratory Manual," Sambrook & Russell, Cold Spring Harbor Press, Cold Spring Harbor, NY, 3rd edition, 2001.
[0080] In embodiments in which the programmable DNA modification protein and/or the at least one programmable DNA binding protein comprises a CRISPR/Cas protein or a variant thereof, the expression vector comprising nucleic acid encoding the programmable DNA modification protein and/or the at least one programmable DNA binding protein can further comprise a sequence encoding one or more guide RNAs. The sequence encoding the guide RNA generally is operably linked to at least one transcriptional control sequence for expression of the guide RNA in the eukaryotic cell of interest. For example, the nucleic acid encoding the guide RNA can be operably linked to a promoter sequence that is recognized by RNA polymerase III (Pol III). Examples of suitable Pol III promoters include, but are not limited to, mammalian U6, U3, H1, and 7SL RNA promoters.
(d) Specific compositions
[0081] In some embodiments, the programmable DNA modification protein and one or more programmable DNA binding proteins are provided as proteins (or, in some cases, as protein-RNA complexes). Programmable DNA modification proteins and programmable DNA binding proteins can be expressed in bacterial or eukaryotic cells and purified using means well known in the art. In other embodiments, the programmable DNA modification protein and one or more programmable DNA binding proteins are provided as coding nucleic acids.
[0082] In some embodiments, the composition can comprise one programmable DNA binding protein/system or encoding nucleic acids. In other embodiments, the composition can comprise two programmable DNA binding proteins/systems or encoding nucleic acids. In yet other embodiments, the composition can comprise three programmable DNA binding proteins/systems or encoding nucleic acids. In further embodiments, the composition can comprise four programmable DNA binding proteins/systems or encoding nucleic acids. In still other embodiments, the composition can comprise five or more programmable DNA binding proteins/systems or encoding nucleic acids.
[0083] In specific embodiments, the programmable DNA modification protein can comprise a CRISPR/Cas system (e.g., a CRISPR/Cas nuclease, a CRISPR/Cas dual nickase, or a catalytically inactive (dead) CRISPR/Cas protein linked to a non-nuclease modification domain) and the programmable DNA binding protein can be a CRISPR/Cas system that lacks nuclease activity. For example, the programmable DNA binding protein can be a catalytically inactive CRISPR/Cas system. In general, each CRISPR/Cas protein comprises at least one nuclear localization signal. In some iterations, the composition can comprise the CRISPR/Cas systems as CRISPR/Cas proteins and guide RNA, wherein the protein and RNA can be separate entities or the protein and RNA can be complexed together. The guide RNA can be at least partially chemically synthesized. The guide RNA can be enzymatically synthesized. In other iterations, the composition can comprise the CRISPR/Cas proteins and DNA encoding the guide RNAs. In still other iterations, the composition can comprise mRNA encoding the CRISPR/Cas proteins and DNA encoding the guide RNAs. In yet other iterations, the composition can comprise plasmid or viral vectors encoding the CRISPR/Cas proteins and/or the guide RNAs. In certain embodiments, the catalytically active CRISPR/Cas protein and the catalytically inactive (dead) CRISPR/Cas protein are Cas9 proteins. Nucleic acids encoding the CRISPR/Cas proteins generally are codon optimized for optimal expression in the eukaryotic cell of interest.
(II) Kits
[0084] Another aspect of the present invention provides kits comprising the compositions detailed above in section (I). The kits can provide the programmable DNA modification protein and the at least one programmable DNA binding protein as proteins, as protein-RNA complexes, or as nucleic acids encoding the various components, as detailed above. The kits can further comprise transfection reagents, cell growth media, selection media, in vitro transcription reagents, nucleic acid purification reagents, protein purification reagents, buffers, and the like. The kits provided herein generally include instructions for carrying out the methods detailed below. Instructions included in the kits can be affixed to packaging material or can be included as a package insert. While the instructions are typically written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this invention. Such media include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD-ROM), and the like. As used herein, the term "instructions" can include the address of an internet site that provides the instructions.
[0085] In some embodiments, the programmable DNA modification protein and/or the at least one programmable DNA binding protein of the kit can comprise a type II CRISPR/Cas system. In certain embodiments, the guide RNA of the type II CRISPR/Cas system can comprise crRNA and tracrRNA. The kit, therefore, can provide the universal tracrRNA(s), and the end user of the kit can provide the sequence-specific crRNA(s). In some aspects, the kit can comprise the type II CRISPR/Cas protein(s) and the tracrRNA(s). In other aspects, the kit can comprise mRNA or DNA encoding the type II CRISPR/Cas protein(s) and DNA encoding the tracrRNA(s).
[0086] In still other embodiments, the programmable DNA modification protein and/or the at least one programmable DNA binding protein of the kit can comprise a type V CRISPR/Cas system. As detailed above, the guide RNA of type V CRISPR/Cas systems comprises only crRNA. In some aspects, the kit can comprise the type V CRISPR/Cas protein(s) and crRNA(s), or the kit can comprise mRNA or DNA encoding the type V CRISPR/Cas protein(s) and DNA encoding the crRNA(s). In other aspects, the kit can comprise only the type V CRISPR/Cas protein(s) or nucleic acid encoding the type V CRISPR/Cas protein(s), wherein the end user of the kit provides the crRNA(s).
(III) Methods for Increasing Accessibility to Targeted Chromosomal Sites
[0087] Another aspect of the present invention encompasses methods for increasing the efficiency and/or specificity of targeted genome/epigenetic modification in eukaryotic cells by increasing the accessibility of a programmable DNA modification protein to its target sequence in chromosomal DNA. The methods comprise introducing into the eukaryotic cell of interest (a) a programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein and (b) at least one programmable DNA binding protein or nucleic acid encoding the at least one programmable DNA binding protein. The programmable DNA modification protein is engineered to recognize and bind to a target sequence in chromosomal DNA, at which site the DNA modification protein can modify the DNA or associated protein(s). Each of the one or more programmable DNA binding proteins is engineered to recognize and bind a sequence proximal to the target chromosomal sequence of the DNA modification protein. The programmable DNA modification proteins and programmable DNA binding proteins are detailed above in section (I).
[0088] In general, the sequence proximal to the target chromosomal sequence is located within about 250 base pairs on either side (that is, upstream or downstream) of the target chromosomal sequence. The proximal site(s) can be located on either strand of the duplex DNA. In some embodiments, the sequence proximal to the target chromosomal sequence can be located less than about 250 bp, less than about 200 bp, less than about 150 bp, less than about 100 bp, less than about 75 bp, less than about 50 bp, less than about 25 bp, less than about 20 bp, less than about 15 bp, less than about 10 bp, or less than about 5 bp from the target chromosomal sequence of the DNA modification protein. In certain embodiments, the sequence proximal to the target chromosomal sequence can be located from about 1 bp to about 10 bp, from about 11 bp to about 20 bp, from about 21 bp to about 30 bp, from about 31 bp to about 40 bp, from about 41 bp to about 50 bp, from about 51 bp to about 60 bp, from about 61 bp to about 70 bp, from about 71 bp to about 80 bp, from about 81 bp to about 90 bp, from about 91 bp to about 100 bp, from about 101 bp to about 150 bp, from about 151 bp to about 200 bp, or from about 201 bp to about 250 bp on either side of the target chromosomal sequence. In other embodiments, the sequence proximal to the target chromosomal sequence can be located from about 5 bp to about 75 bp, from about 10 bp to about 50 bp, or from about 15 bp to about 25 bp on either side of the target chromosomal sequence.
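The proximity criterion can be illustrated with a short Python sketch. It is illustrative only. The text does not specify how the distance is measured, so the sketch assumes half-open coordinates (start, end) on the same chromosome and measures the gap between the nearest edges of the two sites; the function names and coordinates are hypothetical.

```python
def gap_bp(target: tuple[int, int], site: tuple[int, int]) -> int:
    """Number of base pairs between a binding site and the target sequence.

    Assumption: both are half-open intervals (start, end) on one chromosome;
    overlapping or abutting intervals have a gap of 0.
    """
    t_start, t_end = target
    s_start, s_end = site
    if s_end <= t_start:   # binding site lies upstream of the target
        return t_start - s_end
    if s_start >= t_end:   # binding site lies downstream of the target
        return s_start - t_end
    return 0               # intervals overlap


def is_proximal(target: tuple[int, int], site: tuple[int, int], max_gap: int = 250) -> bool:
    """True if the binding site lies within max_gap bp on either side of the target."""
    return gap_bp(target, site) <= max_gap
```

For a hypothetical target at (1000, 1020), a binding site at (950, 970) lies 30 bp upstream and a site at (1040, 1060) lies 20 bp downstream; both satisfy the 250 bp criterion, whereas a site at (500, 520) does not.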
[0089] In some embodiments, the method comprises introducing into the cell at least one programmable DNA binding protein whose binding sequence is located either upstream or downstream of the target chromosomal sequence. In other embodiments, the method comprises introducing into the cell at least two programmable DNA binding proteins, where the binding sequence of one is located upstream of the target chromosomal sequence and the binding sequence of the other is located downstream of the target chromosomal sequence. In other embodiments, the method comprises introducing into the cell at least three programmable DNA binding proteins whose binding sequences are located upstream or downstream of the target chromosomal sequence. In additional embodiments, the method comprises introducing into the cell four or more programmable DNA binding proteins whose binding sequences are located upstream or downstream of the target chromosomal sequence. In these embodiments, for example, the method may comprise introducing one, two, three, four, five, six, seven, eight, nine, ten or more than ten programmable DNA binding proteins whose binding sequences are located within about 250 bp on either side (that is, upstream or downstream) of the target chromosomal sequence.
[0090] Binding of each of the one or more programmable DNA binding proteins to the site proximal to the target chromosomal sequence changes the local chromatin configuration, leading to increased accessibility of the programmable DNA modification protein to the (previously inaccessible) target chromosomal sequence (see Figure 1). As a consequence, the efficiency of modification by the DNA modification protein is increased (see, for example, Examples 1-3). In other words, the efficiency of modification by a DNA modification protein is increased when the DNA modification protein is introduced into the cell in combination with one or more programmable DNA binding proteins, compared to when the DNA modification protein is introduced into the cell alone.
[0091] In addition, the methods described here increase the specificity of targeted genome modification. Although the programmable DNA modification protein is engineered to recognize and bind a target sequence at a specific chromosomal locus, identical or nearly identical sequences can exist at other chromosomal locations (resulting in off-target effects). In embodiments in which the binding of a programmable DNA modification protein to a target chromosomal sequence largely depends on the binding of one or more programmable DNA binding proteins to sequences proximal to the target chromosomal sequence, the binding of the one or more programmable DNA binding proteins to site (s) proximal to the target sequence at the chromosomal locus of interest provides additional specificity for the modification event (see Example 4).
[0092] Thus, the methods described here can increase the efficiency and/or specificity of targeted genome editing (for example, gene corrections, gene knock-outs, gene knock-ins and the like), targeted epigenetic modifications and targeted transcriptional regulation.
(a) Introduction into the Cell
[0093] As described, the method comprises introducing into the cell (a) a programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein and (b) at least one programmable DNA binding protein or nucleic acid encoding the at least one programmable DNA binding protein. Programmable DNA modification proteins are detailed above in section (I) (a), programmable DNA binding proteins are detailed above in section (I) (b), and nucleic acids encoding the DNA modification protein or the programmable DNA binding protein are described above in section (I) (c).
[0094] The programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein and the at least one programmable DNA binding protein or nucleic acid encoding the at least one programmable DNA binding protein can be introduced into the cell of interest by a variety of means.
[0095] In some embodiments, the cell can be transfected with the appropriate molecules (i.e., protein, DNA and/or RNA). Suitable transfection methods include nucleofection (or electroporation), calcium phosphate-mediated transfection, cationic polymer transfection (for example, DEAE-dextran or polyethyleneimine), viral transduction, virosome transfection, virion transfection, liposome transfection, cationic liposome transfection, immunoliposome transfection, non-liposomal lipid transfection, dendrimer transfection, heat shock transfection, magnetofection, lipofection, gene gun delivery, impalefection, sonoporation, optical transfection and proprietary agent-enhanced uptake of nucleic acids. Transfection methods are well known in the art (see, for example, "Current Protocols in Molecular Biology", Ausubel et al., John Wiley & Sons, New York, 2003 or "Molecular Cloning: A Laboratory Manual", Sambrook & Russell, Cold Spring Harbor Press, Cold Spring Harbor, NY, 3rd edition, 2001). In other embodiments, the molecules can be introduced into the cell by microinjection. For example, the molecules can be injected into the cytoplasm or nuclei of the cells of interest. The amount of each molecule introduced into the cell can vary, but those skilled in the art are familiar with means for determining the appropriate amount.
[0096] The various molecules can be introduced into the cell simultaneously or sequentially. For example, the programmable DNA modification protein (or its encoding nucleic acid) and the at least one programmable DNA binding protein (or its encoding nucleic acid) can be introduced at the same time. Alternatively, one can be introduced first and the other can be introduced into the cell later.
[0097] In general, the cell is maintained under conditions suitable for cell growth and/or maintenance. Suitable cell culture conditions are well known in the art and are described, for example, in Santiago et al., Proc. Natl. Acad. Sci. USA, 2008, 105: 5809-5814; Moehle et al., Proc. Natl. Acad. Sci. USA, 2007, 104: 3055-3060; Urnov et al., Nature, 2005, 435: 646-651; and Lombardo et al., Nat. Biotechnol., 2007, 25: 1298-1306. Those skilled in the art appreciate that methods for culturing cells are known in the art and can and will vary depending on the cell type. Routine optimization can be used, in all cases, to determine the best techniques for a particular cell type.
(b) Targeted Genome Modification
[0098] The binding of the one or more programmable DNA binding proteins to sequence (s) proximal to the target chromosomal sequence changes the local chromatin configuration; for example, nucleosome structure can be altered and/or histones can be displaced. As a consequence, the programmable DNA modification protein is able to better access the target chromosomal sequence compared to when the programmable DNA modification protein is used alone. The increased accessibility results in greater efficiency and/or specificity of targeted genome modification. The targeted genome / epigenetic modification can be mediated by DNA modification proteins that have nuclease activity or non-nuclease activity.
[0099] In embodiments in which the programmable DNA modification protein has nuclease activity, the DNA modification protein can introduce a double-strand break in the target chromosomal sequence. The double-strand break in the chromosomal sequence can be repaired by a non-homologous end-joining (NHEJ) repair process. Because NHEJ is error-prone, deletions of at least one nucleotide, insertions of at least one nucleotide, substitutions of at least one nucleotide, or combinations thereof, can occur during repair of the break. Therefore, the target chromosomal sequence can be modified or inactivated. For example, a deletion, insertion or substitution that shifts the reading frame of a coding sequence can lead to an altered protein product or to no protein product (which is called a "knock-out"). In some iterations, the method may further comprise introducing into the cell a donor polynucleotide (see below) comprising a donor sequence that is flanked by sequences having substantial sequence identity to sequences located on either side of the target chromosomal sequence, such that during repair of the double-strand break by a homology-directed repair (HDR) process the donor sequence in the donor polynucleotide can be exchanged with or integrated into the chromosomal sequence at the target chromosomal sequence. The integration of an exogenous sequence is called a "knock-in". As detailed above, the methods described here also reduce off-target effects by increasing the specificity of targeted genome modification.
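The relationship between indel size and reading-frame disruption mentioned above can be illustrated with a short Python sketch. It assumes, for illustration only, that the repair outcome within a coding sequence is summarized by the numbers of inserted and deleted nucleotides; the function name is hypothetical. An in-frame change can still alter the protein product, so the sketch addresses only the frameshift case.

```python
def shifts_reading_frame(inserted_nt, deleted_nt):
    """True if the net length change is not a multiple of three nucleotides."""
    net_change = inserted_nt - deleted_nt
    return net_change % 3 != 0


print(shifts_reading_frame(0, 1))  # 1 nt deletion
print(shifts_reading_frame(0, 3))  # 3 nt deletion keeps the frame
print(shifts_reading_frame(4, 0))  # 4 nt insertion
```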
[00100] In various iterations, therefore, the efficiency and/or specificity of targeted genome modification can be increased by at least about 0.1-fold, at least about 0.5-fold, at least about 1-fold, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 50-fold, at least about 100-fold, or more than about 100-fold relative to when the programmable DNA modification protein having nuclease activity is used alone. For example, the programmable DNA modification protein having nuclease activity, when used alone, may produce no detectable indels or integration events. However, when the programmable DNA modification protein having nuclease activity is used in combination with at least one programmable DNA binding protein, indels and integration events can be detected (for example, at least about 1% indels / integrations, at least about 5% indels / integrations, at least about 10% indels / integrations, at least about 20% indels / integrations, at least about 30% indels / integrations, at least about 40% indels / integrations, at least about 50% indels / integrations, or more than about 50% indels / integrations).
[00101] In embodiments in which the programmable DNA modification protein has non-nuclease activity, the DNA modification protein can modify DNA or associated proteins at the target chromosomal sequence or modify the expression of the target chromosomal sequence. For example, when the programmable DNA modification protein comprises epigenetic modification activity, the status of histone acetylation, methylation, phosphorylation, adenylation, etc. can be modified or the status of DNA methylation, amination, etc. can be modified. As an example, in embodiments in which the programmable DNA modification protein comprises cytosine deaminase activity, one or more cytosine residues in the target chromosomal sequence can be converted to uracil residues. Alternatively, when the programmable DNA modification protein comprises transcriptional activation or repressor activity, transcription at the target chromosomal sequence can be increased or decreased. The resulting epigenetic modification or transcriptional regulation can be increased by at least about 0.1-fold, at least about 0.5-fold, at least about 1-fold, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 50-fold, at least about 100-fold, or more than about 100-fold relative to when the programmable DNA modification protein having non-nuclease activity is used alone.
[00102] The targeted genome modifications / epigenetic modifications detailed above can be performed singly or multiplexed (i.e., two or more chromosomal sequences can be targeted simultaneously).
(c) Optional Donor Polynucleotide
[00103] In embodiments in which the programmable DNA modification protein comprises nuclease activity, the method may further comprise introducing at least one donor polynucleotide into the cell. The donor polynucleotide can be single-stranded or double-stranded, linear or circular, and/or RNA or DNA. In some embodiments, the donor polynucleotide may be a vector, for example, a plasmid vector.
[00104] The donor polynucleotide comprises at least one donor sequence. In some aspects, the donor sequence of the donor polynucleotide may be a modified version of an endogenous or native chromosomal sequence. For example, the donor sequence may be essentially identical to a portion of the chromosomal sequence at, or close to, the sequence targeted by the DNA modification protein, but comprise at least one nucleotide change. Thus, after integration or exchange with the native sequence, the sequence at the target chromosomal location comprises at least one nucleotide change. For example, the change may be an insertion of one or more nucleotides, a deletion of one or more nucleotides, a substitution of one or more nucleotides, or combinations thereof. As a consequence of this "gene correction" integration of the modified sequence, the cell can produce a modified gene product from the target chromosomal sequence.
[00105] In other aspects, the donor sequence of the donor polynucleotide may be an exogenous sequence. As used herein, an "exogenous" sequence refers to a sequence that is not native to the cell, or a sequence whose native location is at a different location in the cell's genome. For example, the exogenous sequence can comprise a protein coding sequence, which can be operably linked to an exogenous promoter control sequence so that, after integration into the genome, the cell is able to express the protein encoded by the integrated sequence. Alternatively, the exogenous sequence can be integrated into the chromosomal sequence so that its expression is regulated by an endogenous promoter control sequence. In other iterations, the exogenous sequence can be a transcriptional control sequence, another expression control sequence, an RNA coding sequence, and so on. As noted above, the integration of an exogenous sequence into a chromosomal sequence is called a "knock-in".
[00106] As can be appreciated by those skilled in the art, the length of the donor sequence can and will vary. For example, the donor sequence can vary in length from several nucleotides to hundreds of nucleotides to hundreds of thousands of nucleotides.
[00107] Typically, the donor sequence in the donor polynucleotide is flanked by an upstream sequence and a downstream sequence, which have substantial sequence identity to sequences located upstream and downstream, respectively, of the sequence targeted by the programmable DNA modification protein. Because of these sequence similarities, the upstream and downstream sequences of the donor polynucleotide permit homologous recombination between the donor polynucleotide and the target chromosomal sequence, so that the donor sequence can be integrated into (or exchanged with) the chromosomal sequence.
[00108] The upstream sequence, as used herein, refers to a nucleic acid sequence that shares substantial sequence identity with a chromosomal sequence upstream of the sequence targeted by the programmable DNA modification protein. Similarly, the downstream sequence refers to a nucleic acid sequence that shares substantial sequence identity with a chromosomal sequence downstream of the sequence targeted by the programmable DNA modification protein. As used herein, the phrase "substantial sequence identity" refers to sequences having at least about 75% sequence identity. Thus, the upstream and downstream sequences in the donor polynucleotide can have about 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity with the sequence upstream or downstream of the target sequence. In an exemplary embodiment, the upstream and downstream sequences in the donor polynucleotide can have about 95% or 100% sequence identity with the chromosomal sequences upstream or downstream of the sequence targeted by the programmable DNA modification protein.
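By way of a non-limiting sketch, the 75% threshold for "substantial sequence identity" can be checked as follows in Python. The sketch assumes two already aligned, equal-length sequences with no gaps, which is a simplification; this specification does not prescribe an alignment method, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical bases in two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def has_substantial_identity(seq_a, seq_b, threshold=75.0):
    """True if the ungapped identity meets the threshold (75% by default)."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 10 nt homology arm compared with the chromosomal sequence.
arm = "ACGTACGTAC"
chromosome = "ACGTACGTAA"
print(percent_identity(arm, chromosome))
print(has_substantial_identity(arm, chromosome))
```

For homology arms of realistic length, a gapped alignment tool would normally be used in place of this position-by-position comparison.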
[00109] In some embodiments, the upstream sequence shares substantial sequence identity with a chromosomal sequence located immediately upstream of the sequence targeted by the programmable DNA modification protein. In other embodiments, the upstream sequence shares substantial sequence identity with a chromosomal sequence that is located within about one hundred (100) nucleotides upstream from the target sequence. Thus, for example, the upstream sequence can share substantial sequence identity with a chromosomal sequence that is located about 1 to about 20, about 21 to about 40, about 41 to about 60, about 61 to about 80 or about 81 to about 100 nucleotides upstream of the target sequence. In some embodiments, the downstream sequence shares a substantial sequence identity with a chromosomal sequence located immediately downstream of the sequence targeted by the programmable DNA modification protein. In other embodiments, the downstream sequence shares substantial sequence identity with a chromosomal sequence that is located within about one hundred (100) nucleotides downstream from the target sequence. Thus, for example, the downstream sequence can share substantial sequence identity with a chromosomal sequence that is located about 1 to about 20, about 21 to about 40, about 41 to about 60, about 61 to about 80 or about 81 to about 100 nucleotides downstream of the target sequence.
[00110] Each upstream or downstream sequence can range in length from about 20 nucleotides to about 5000 nucleotides. In some embodiments, the upstream and downstream sequences may comprise about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2800, 3000, 3200, 3400, 3600, 3800, 4000, 4200, 4400, 4600, 4800 or 5000 nucleotides. In specific embodiments, the upstream and downstream sequences can range in length from about 50 to about 1500 nucleotides.
(d) Cell Types
[00111] A variety of cells are suitable for use in the methods described herein. In general, the cell is a eukaryotic cell. For example, the cell can be a human cell, a non-human mammalian cell, a non-mammalian vertebrate cell, an invertebrate cell, an insect cell, a plant cell, a yeast cell, or a single-cell eukaryotic organism. In some embodiments, the cell can also be a one-cell embryo, for example, a non-human mammalian embryo, including rat, hamster, rodent, rabbit, feline, canine, ovine, porcine, bovine, equine and primate embryos. In yet other embodiments, the cell can be a stem cell, such as an embryonic stem cell, an ES-like stem cell, a fetal stem cell, an adult stem cell, and the like. In one embodiment, the stem cell is not a human embryonic stem cell. In addition, stem cells can include those produced by the techniques described in WO 2003/046141, which is incorporated herein in its entirety, or Chung et al. (Cell Stem Cell, 2008, 2: 113-117). The cell can be in vitro or in vivo (i.e., within an organism). In exemplary embodiments, the cell is a mammalian cell. In particular embodiments, the cell is a human cell.
[00112] Non-limiting examples of suitable mammalian cells include human embryonic kidney cells (HEK293, HEK293T); human cervical carcinoma cells (HeLa); human lung cells (WI-38); human liver cells (Hep G2); human osteosarcoma U2-OS cells; human A549 cells; human A-431 cells; human K562 cells; Chinese hamster ovary (CHO) cells; baby hamster kidney (BHK) cells; mouse myeloma NS0 cells; mouse embryonic fibroblast 3T3 cells (NIH3T3); mouse B lymphoma A20 cells; mouse melanoma B16 cells; mouse myoblast C2C12 cells; mouse myeloma SP2/0 cells; mouse embryonic mesenchymal C3H-10T1/2 cells; mouse carcinoma CT26 cells; mouse prostate DuCuP cells; mouse breast EMT6 cells; mouse hepatoma Hepa1c1c7 cells; mouse myeloma J5582 cells; mouse epithelial MTD-1A cells; mouse myocardial MyEnd cells; mouse renal cells; mouse pancreatic RIN-5F cells; mouse melanoma X64 cells; mouse lymphoma YAC-1 cells; 9L mouse glioblastoma cells; rat B lymphoma RBL cells; rat neuroblastoma B35 cells; rat hepatoma cells (HTC); buffalo rat liver BRL 3A cells; canine kidney cells (MDCK); canine breast cells (CMT); canine osteosarcoma D17 cells; canine monocyte / macrophage DH82 cells; monkey kidney SV-40 transformed fibroblast cells (COS7); monkey kidney CVI-76 cells; and African green monkey kidney cells (VERO-76). An extensive list of mammalian cell lines can be found in the American Type Culture Collection catalog (ATCC, Manassas, VA).
(IV) Methods for Detecting Specific Genomic Loci
[00113] Methods for detecting or visualizing specific genomic loci in eukaryotic cells are also provided here. Because the proximal binding of one or more programmable DNA binding proteins alters chromatin structure and increases the accessibility of the programmable DNA modification protein to previously inaccessible chromosomal loci, the method described above in section (III) can be modified to enhance the detection of specific genomic loci or target chromosomal sequences. The method comprises introducing into the eukaryotic cell (a) a programmable DNA binding protein comprising at least one detectable marker domain or nucleic acid encoding the programmable DNA binding protein comprising at least one detectable marker domain and (b) at least one programmable DNA binding protein or nucleic acid encoding the at least one programmable DNA binding protein, wherein the programmable DNA binding protein comprising at least one detectable marker domain is directed to a target chromosomal sequence and each of the one or more programmable DNA binding proteins is directed to a site proximal to the target chromosomal sequence. Binding of the at least one programmable DNA binding protein to the site proximal to the target chromosomal sequence increases the accessibility of the programmable DNA binding protein comprising at least one detectable marker domain to the target chromosomal sequence. The method further comprises detecting the programmable DNA binding protein comprising at least one detectable marker domain bound to the target chromosomal sequence.
[00114] The programmable DNA binding protein comprising at least one detectable marker domain comprises a programmable DNA binding domain. Suitable programmable DNA binding domains are described above in section (I) (a) (vi). In specific embodiments, the programmable DNA binding domain can be a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein or a transcription activator-like effector. The at least one detectable marker domain of the programmable DNA binding protein can be a fluorescent protein (e.g., GFP, eGFP, RFP, and the like), a fluorescent marker, or an epitope marker (which are described in section (I) (a) (i) above). In certain embodiments, the at least one detectable marker domain of the programmable DNA binding protein can be a naturally occurring epitope within the programmable DNA binding protein, so that the programmable DNA binding protein is detected by an antibody against the programmable DNA binding protein. The programmable DNA binding protein comprising at least one detectable marker domain can further comprise at least one nuclear localization signal and/or cell-penetrating domain, as described above in section (I) (a) (i). In some embodiments, the programmable DNA binding protein comprising at least one detectable marker domain may also comprise a non-nuclease modification domain (as described above in section (I) (a) (vi)).
[00115] The one or more programmable DNA binding proteins are described above in section (I) (b). In general, the at least one programmable DNA binding protein can be a catalytically inactive CRISPR/Cas protein, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase.
[00116] The method further comprises detecting the programmable DNA binding protein comprising the detectable marker domain that is bound to the target chromosomal sequence, wherein the detection can be by means of dynamic live cell imaging, fluorescence microscopy, confocal microscopy, immunofluorescence, immunodetection, RNA-protein binding, protein-protein binding, and the like. The detection step can be performed on live cells or fixed cells.
[00117] In embodiments in which the method comprises detecting the structural dynamics of chromatin in live cells, the programmable DNA binding protein comprising the detectable marker domain and the one or more programmable DNA binding proteins can be introduced into the cell as proteins or nucleic acids, essentially as described above in section (III) (a). In embodiments in which the method comprises detecting the target chromosomal sequence in fixed cells, the programmable DNA binding protein comprising the detectable marker domain and the programmable DNA binding proteins can be introduced into the cell as proteins (or RNA-protein complexes). Methods for fixing and permeabilizing cells are well known in the art. In some embodiments, fixed cells can be subjected to chemical and/or thermal denaturation processes to convert double-stranded chromosomal DNA into single-stranded DNA. In other embodiments, the fixed cells are not subjected to chemical and/or thermal denaturation processes.
[00118] In specific embodiments, the programmable DNA binding protein comprising the detectable marker domain is a fusion protein comprising a catalytically inactive (or dead) CRISPR/Cas protein and a fluorescent protein marker domain, and the at least one programmable DNA binding protein is a catalytically inactive (or dead) CRISPR/Cas protein.
[00119] In embodiments in which at least one of the programmable DNA modification or DNA binding proteins comprises a CRISPR/Cas protein, the guide RNA may also comprise a detectable marker for in situ detection (for example, FISH or CISH). Detectable markers are detailed above in section (I) (a) (i). In some embodiments, each of the programmable DNA modification and DNA binding proteins comprises a CRISPR/Cas protein and each guide RNA comprises at least one detectable marker, thereby increasing the amount or intensity of the signal to be detected.
[00120] In yet other embodiments, the proximally bound programmable DNA modification protein and the one or more programmable DNA binding proteins can be detected by means of a proximity detection assay. For example, the programmable DNA modification protein can be bound by a first antibody and at least one of the programmable DNA binding proteins can be bound by a second antibody, each of which is linked, directly or indirectly (for example, via secondary antibodies), to a single-stranded proximity detection oligonucleotide. In other embodiments, the single-stranded proximity detection oligonucleotide (s) can be linked, directly or indirectly, to the guide RNA (s). In yet other embodiments, the single-stranded proximity detection oligonucleotide (s) can be linked, directly or indirectly, to the programmable DNA modification protein or the programmable DNA binding proteins. Proximity detection oligonucleotides that are complexed with proximally located, chromosomally bound proteins can be detected by means of an in situ proximity-dependent amplification reaction. The in situ proximity-dependent amplification reaction can be a proximity ligation assay (PLA, see Söderberg et al., Nature Methods, 2006, 3 (12): 995-1000) or a proximity-dependent initiation of hybridization chain reaction (proxHCR, see Koos et al., Nature Communications, 2015, 6: 7294, 10 pp.).
(V) Applications
[00121] The compositions and methods described herein can be used in a variety of therapeutic, diagnostic, industrial and research applications. In some embodiments, the present invention can be used to modify any chromosomal sequence of interest in a cell, animal or plant in order to model and/or study the function of genes, to study genetic or epigenetic conditions of interest, or to study the biochemical pathways involved in various diseases or disorders. For example, transgenic organisms can be created that model diseases or disorders, in which the expression of one or more nucleic acid sequences associated with a disease or disorder is altered. The disease model can be used to study the effects of mutations on the organism, to study the development and/or progression of the disease, to study the effect of a pharmaceutically active compound on the disease and/or to evaluate the effectiveness of a potential gene therapy strategy.
[00122] In other embodiments, the compositions and methods can be used to perform efficient and cost-effective functional genomic screens, which can be used to study the function of genes involved in a particular biological process and how any alteration in gene expression can affect the biological process, or to perform saturating or deep scanning mutagenesis of genomic loci in conjunction with a cellular phenotype. Saturating or deep scanning mutagenesis can be used to determine critical minimal features and discrete vulnerabilities of functional elements required for gene expression, drug resistance and reversal of disease, for example.
[00123] In other embodiments, the compositions and methods described herein can be used for diagnostic tests to establish the presence of a disease or disorder and/or for use in determining treatment options. Examples of suitable diagnostic tests include the detection of specific mutations in cancer cells (for example, specific mutations in EGFR, HER2 and the like), detection of specific mutations associated with particular diseases (for example, trinucleotide repeats, β-globin mutations associated with sickle cell disease, specific SNPs, etc.), detection of hepatitis, detection of viruses (for example, Zika) and so on.
[00124] In additional embodiments, the compositions and methods described herein can be used to correct genetic mutations associated with a particular disease or disorder, such as, for example, correcting mutations in the globin gene associated with sickle cell disease or thalassemia, correcting mutations in the adenosine deaminase gene associated with severe combined immunodeficiency (SCID), reducing the expression of HTT, the gene that causes Huntington's disease, or correcting mutations in the rhodopsin gene for the treatment of retinitis pigmentosa. Such modifications can be made in cells ex vivo.
[00125] In yet other embodiments, the compositions and methods described herein can be used to generate crop plants with improved characteristics or increased resistance to environmental stresses. The present invention can also be used to generate farm animals with improved characteristics or production animals. For example, pigs have many characteristics that make them attractive as biomedical models, especially in regenerative medicine or xenotransplantation.
DEFINITIONS
[00126] Unless otherwise defined, all technical and scientific terms used here have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (ed. Walker, 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The Harper Collins Dictionary of Biology (1991). As used here, the following terms have the meanings assigned to them, unless otherwise specified.
[00127] When introducing elements of the present invention or the preferred embodiments thereof, the articles "a", "an", "the" and "said" are intended to mean that there are one or more of the elements. The terms "comprising", "including" and "having" are intended to be inclusive and mean that there may be additional elements other than the elements listed.
[00128] The term "about" when used in relation to a numerical value, x, for example, means x ± 5%.
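Purely as an illustration, the ±5% reading of "about" can be expressed as a small Python helper. The function names and the default tolerance parameter are hypothetical conveniences and are not part of this definition.

```python
def about_range(x, tolerance=0.05):
    """Return the (low, high) bounds of x ± 5% (tolerance given as a fraction)."""
    delta = abs(x) * tolerance
    return (x - delta, x + delta)


def is_about(value, x, tolerance=0.05):
    """True if value lies within x ± tolerance * |x|."""
    low, high = about_range(x, tolerance)
    return low <= value <= high


# "About 250 bp" read as 250 ± 5%.
print(about_range(250))
print(is_about(260, 250))
print(is_about(265, 250))
```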
[00129] As used herein, the terms "complementary" or "complementarity" refer to the association of double-stranded nucleic acids by base pairing through specific hydrogen bonds. The base pairing can be standard Watson-Crick base pairing (e.g., 5'-A G T C-3' pairs with the complementary sequence 3'-T C A G-5'). The base pairing can also be Hoogsteen or reverse Hoogsteen hydrogen bonding. Complementarity is typically measured with respect to a duplex region and therefore excludes overhangs, for example. Complementarity between the two strands of the duplex region can be partial and expressed as a percentage (e.g., 70%) if only some (e.g., 70%) of the bases are complementary. The bases that are not complementary are "mismatched". Complementarity can also be complete (i.e., 100%) if all the bases in the duplex region are complementary.
[00130] As used herein, the term "CRISPR/Cas system" refers to a complex comprising a CRISPR/Cas protein (i.e., nuclease, nickase, or catalytically dead protein) and a guide RNA.
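By way of illustration only, the percentage described above can be computed as in the following minimal sketch. The function name `percent_complementarity` and the convention that both strings cover only the duplex region, written so that facing bases share an index, are assumptions made for this example; only standard Watson-Crick pairing is modeled, not Hoogsteen pairing.

```python
# Watson-Crick partners for DNA bases (standard pairing only; Hoogsteen
# pairing is deliberately not modeled in this sketch).
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(strand_5to3: str, partner_3to5: str) -> float:
    """Percent of duplex positions that form a Watson-Crick base pair.

    Both arguments cover only the duplex region (no overhangs) and are
    written so that position i of one strand faces position i of the other.
    """
    if len(strand_5to3) != len(partner_3to5) or not strand_5to3:
        raise ValueError("duplex strands must be non-empty and equal in length")
    paired = sum(
        WATSON_CRICK.get(a) == b
        for a, b in zip(strand_5to3.upper(), partner_3to5.upper())
    )
    return 100.0 * paired / len(strand_5to3)


print(percent_complementarity("AGTC", "TCAG"))  # 5'-AGTC-3' against 3'-TCAG-5'
print(percent_complementarity("AGTC", "TCAA"))  # one mismatched base out of four
```

With the example pair from the definition, every position pairs and the duplex is completely complementary; replacing the last base of the partner strand leaves three of four positions paired.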
[00131] The term "endogenous sequence", as used here, refers to a chromosomal sequence that is native to the cell.
[00132] As used here, the term "exogenous" refers to a sequence that is not native to the cell, or a chromosomal sequence whose native location in the cell's genome is at a different chromosomal location.
[00133] A "gene", as used herein, refers to a DNA region (including exons and introns) that encodes a gene product, as well as all DNA regions that regulate the production of the gene product, whether or not such regulatory sequences are adjacent to the coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites, and locus control regions.
[00134] The term "heterologous" refers to an entity that is not endogenous or native to the cell of interest. For example, a heterologous protein refers to a protein that is derived from or was originally derived from an exogenous source, such as an exogenously introduced nucleic acid sequence. In some cases, the heterologous protein is not normally produced by the cell of interest.
[00135] The terms "local chromatin structure" or "local chromatin configuration", as used herein, refer to nucleosomal structure and/or histone protein spacing, and generally do not refer to the compaction of nucleosomes into chromatin fibers and heterochromatin.
[00136] The term "nickase" refers to an enzyme that cleaves one strand of a double-stranded nucleic acid sequence (i.e., nicks a double-stranded sequence). For example, a nuclease with double-strand cleavage activity can be modified by mutation and/or deletion to function as a nickase and cleave only one strand of a double-stranded sequence.
[00137] The term "nuclease", as used herein, refers to an enzyme that cleaves both strands of a double-stranded nucleic acid sequence.
[00138] The terms "nucleic acid" and "polynucleotide" refer to a deoxyribonucleotide or ribonucleotide polymer, in linear or circular conformation, and in either single- or double-stranded form. For the purposes of the present invention, these terms are not to be construed as limiting with respect to the length of a polymer. The terms can encompass known analogs of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties (e.g., phosphorothioate backbones). In general, an analog of a particular nucleotide has the same base-pairing specificity; that is, an analog of A will base-pair with T.
[00139] The term "nucleotide" refers to deoxyribonucleotides or ribonucleotides. The nucleotides can be standard nucleotides (i.e., adenosine, guanosine, cytidine, thymidine and uridine), nucleotide isomers or nucleotide analogs. A nucleotide analog refers to a nucleotide having a modified purine or pyrimidine base or a modified ribose moiety. A nucleotide analog can be a naturally occurring nucleotide (e.g., inosine, pseudouridine, etc.) or a non-naturally occurring nucleotide. Non-limiting examples of modifications on the sugar or base moieties of a nucleotide include the addition (or removal) of acetyl groups, amino groups, carboxyl groups, carboxymethyl groups, hydroxyl groups, methyl groups, phosphoryl groups and thiol groups, as well as the substitution of the carbon and nitrogen atoms of the bases with other atoms (e.g., 7-deaza purines). Nucleotide analogs also include dideoxy nucleotides, 2'-O-methyl nucleotides, locked nucleic acids (LNA), peptide nucleic acids (PNA) and morpholinos.
[00140] The terms "polypeptide" and "protein" are used interchangeably to refer to a polymer of amino acid residues.
[00141] The term "proximal site", as used herein, refers to a binding site or nucleotide sequence that is located within about 250 base pairs on either side of a target sequence in chromosomal DNA.
[00142] As used herein, the term "programmable DNA modification protein" refers to a protein that is engineered to bind a specific target sequence in chromosomal DNA and that modifies the DNA or protein(s) associated with the DNA at or near the target sequence.
[00143] The term "programmable DNA-binding protein", as used herein, refers to a protein that is engineered to bind a specific target sequence in chromosomal DNA, but that does not modify the DNA or protein(s) associated with the DNA at or near the target sequence.
[00144] The terms "target sequence", "target chromosomal sequence" and "target site" are used interchangeably to refer to the specific sequence in chromosomal DNA to which the programmable DNA modification protein is targeted, and the site at which the programmable DNA modification protein modifies the DNA or protein(s) associated with the DNA.
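As a non-limiting illustration, the positional test implied by this definition can be sketched as follows. The function name `is_proximal_site`, the 0-based half-open coordinates, the choice to measure distance between the nearest edges of the two intervals, and the treatment of an overlapping site as proximal are all assumptions of this sketch; the ±5% allowance follows the definition of "about" given above.

```python
def is_proximal_site(site_start: int, site_end: int,
                     target_start: int, target_end: int,
                     window_bp: int = 250,
                     about_tolerance: float = 0.05) -> bool:
    """Check whether a binding site lies within about window_bp of a target.

    Coordinates are 0-based, half-open [start, end) positions on the same
    chromosome. The distance is the gap between the nearest edges of the two
    intervals (an assumption of this sketch); overlapping intervals have a
    gap of zero and are treated as proximal.
    """
    if site_start >= site_end or target_start >= target_end:
        raise ValueError("intervals must be non-empty")
    if site_end <= target_start:        # site lies entirely on the 5' side
        gap = target_start - site_end
    elif site_start >= target_end:      # site lies entirely on the 3' side
        gap = site_start - target_end
    else:                               # site overlaps the target
        gap = 0
    # "about x" is read as x plus or minus 5%, so the upper bound is widened.
    return gap <= window_bp * (1.0 + about_tolerance)


print(is_proximal_site(100, 120, 300, 320))                # gap of 180 bp
print(is_proximal_site(0, 20, 300, 320))                   # gap of 280 bp
print(is_proximal_site(100, 120, 300, 320, window_bp=25))  # narrower window
```

Passing a smaller `window_bp` (for example 100, 75, 50 or 25) applies the narrower windows recited elsewhere in this description.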
[00145] Techniques for determining nucleic acid and amino acid sequence identity are known in the art. Typically, such techniques include determining the nucleotide sequence of the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and comparing these sequences with a second nucleotide or amino acid sequence. Genomic sequences can also be determined and compared in this fashion. In general, identity refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotide or polypeptide sequences, respectively. Two or more sequences (polynucleotide or amino acid) can be compared by determining their percent identity. The percent identity of two sequences, whether nucleic acid or amino acid sequences, is the number of exact matches between two aligned sequences divided by the length of the shorter sequence and multiplied by 100. An approximate alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff, Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, DC, USA, and normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An exemplary implementation of this algorithm for determining the percent identity of a sequence is provided by the Genetics Computer Group (Madison, Wisconsin) in the "BestFit" utility application. Other programs suitable for calculating percent identity or similarity between sequences are generally known in the art; for example, another alignment program is BLAST, used with default parameters. For example, BLASTN and BLASTP can be used with the following default parameters: genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+Swiss protein+Spupdate+PIR. Details of these programs can be found on the GenBank website.
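For illustration, the percent identity calculation stated above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) can be sketched as follows. The function name `percent_identity`, the use of "-" as the gap character, and the assumption that the two sequences have already been aligned by some other tool are choices made for this example only; the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Exact matches are counted column by column; gap characters never count
    as matches. The denominator is the ungapped length of the shorter of
    the two sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same number of columns")
    matches = sum(
        a == b and a != gap
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    )
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Eight alignment columns: six exact matches, one gap, one mismatch.
# The shorter sequence has seven residues, so the result is 6/7 x 100.
print(round(percent_identity("ACGTACGT", "ACGT-CGA"), 1))
```

The same function applies to aligned amino acid sequences, since it only compares characters column by column.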
[00146] As various changes could be made in the above-described cells and methods without departing from the scope of the invention, it is intended that all matter contained in the above description and in the examples given below shall be interpreted as illustrative and not in a limiting sense.
ENUMERATED EMBODIMENTS
[00147] The following enumerated embodiments are presented to illustrate certain aspects of the present invention, and are not intended to limit its scope. 1. A composition comprising: (a) a programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein; and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein. 2. The composition of embodiment 1, wherein the programmable DNA modification protein is an RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) (CRISPR/Cas) nuclease system, a CRISPR/Cas dual nickase system, a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), a meganuclease, a fusion protein comprising a programmable DNA-binding domain linked to a nuclease domain, or a fusion protein comprising a programmable DNA-binding domain linked to a non-nuclease domain. 3. The composition of embodiment 2, wherein the programmable DNA-binding domain of the fusion protein is a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, or a transcription activator-like effector. 4. The composition of embodiment 2 or 3, wherein the non-nuclease domain of the fusion protein has acetyltransferase activity, deacetylase activity, methyltransferase activity, demethylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, demyristoylation activity, citrullination activity, helicase activity, amination activity, deamination activity, alkylation activity, dealkylation activity, oxidation activity, transcriptional activation activity, or transcriptional repressor activity. 5. 
The composition of embodiment 4, wherein the non-nuclease domain of the fusion protein has cytosine deaminase activity, histone acetyltransferase activity, transcriptional activation activity, or transcriptional repressor activity. 6. The composition of any one of embodiments 1 to 5, wherein the at least one programmable DNA-binding protein is a catalytically inactive CRISPR/Cas protein, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase. 7. The composition of any one of embodiments 1 to 6, wherein the nucleic acid encoding the programmable DNA modification protein and the at least one programmable DNA-binding protein is RNA or DNA, and/or wherein said nucleic acid is part of a plasmid vector or a viral vector. 8. The composition of any one of embodiments 1 to 6, wherein the programmable DNA modification protein is a CRISPR/Cas nuclease system, a CRISPR/Cas dual nickase system, or a catalytically inactive CRISPR/Cas system linked to a non-nuclease domain, and the at least one programmable DNA-binding protein is a catalytically inactive CRISPR/Cas system, wherein each CRISPR/Cas system comprises a CRISPR/Cas protein and a guide RNA. 9. The composition of embodiment 8, wherein each CRISPR/Cas nuclease system is a type I CRISPR/Cas system, a type II CRISPR/Cas system, a type III CRISPR/Cas system, or a type V CRISPR/Cas system. 10. The composition of embodiment 9, wherein each CRISPR/Cas nuclease system is a type II CRISPR/Cas system or a type V CRISPR/Cas system. 11. The composition of any one of embodiments 8 to 10, wherein the nucleic acid encoding each CRISPR/Cas protein is mRNA or DNA. 12. The composition of any one of embodiments 8 to 11, wherein the nucleic acid encoding each CRISPR/Cas protein and/or the nucleic acid encoding each guide RNA is part of a plasmid vector or a viral vector. 13. 
The composition of any one of embodiments 8 to 11, wherein the guide RNA of each CRISPR/Cas system is enzymatically synthesized. 14. The composition of any one of embodiments 8 to 11, wherein the guide RNA of each CRISPR/Cas system is at least partially chemically synthesized. 15. A kit comprising the composition of any one of embodiments 1 to 14. 16. A method for increasing the efficiency and/or specificity of targeted genome modification in a eukaryotic cell, the method comprising introducing into the eukaryotic cell: (a) a programmable DNA modification protein or nucleic acid encoding the programmable DNA modification protein; and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein; wherein the programmable DNA modification protein is targeted to a target chromosomal sequence and each of the at least one programmable DNA-binding protein is targeted to a site proximal to the target chromosomal sequence, and binding of the at least one programmable DNA-binding protein to the site proximal to the target chromosomal sequence increases the accessibility of the programmable DNA modification protein to the target chromosomal sequence, thereby increasing the efficiency and/or specificity of targeted genome modification. 17. The method of embodiment 16, wherein the site proximal to the target chromosomal sequence is located within about 250 base pairs on either side of the target chromosomal sequence. 18. The method of embodiment 17, wherein the site proximal to the target chromosomal sequence is located within about 100 base pairs on either side of the target chromosomal sequence. 19. The method of embodiment 18, wherein the site proximal to the target chromosomal sequence is located within about 75 base pairs on either side of the target chromosomal sequence. 20. 
The method of embodiment 19, wherein the site proximal to the target chromosomal sequence is located within about 50 base pairs on either side of the target chromosomal sequence. 21. The method of embodiment 20, wherein the site proximal to the target chromosomal sequence is located within about 25 base pairs on either side of the target chromosomal sequence. 22. The method of any one of embodiments 16 to 21, wherein the programmable DNA modification protein is a CRISPR/Cas nuclease system, a CRISPR/Cas dual nickase system, a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), a meganuclease, a fusion protein comprising a programmable DNA-binding domain linked to a nuclease domain, or a fusion protein comprising a programmable DNA-binding domain linked to a non-nuclease domain. 23. The method of embodiment 22, wherein the programmable DNA-binding domain of the fusion protein is a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, or a transcription activator-like effector. 24. The method of embodiment 22 or 23, wherein the non-nuclease modification domain of the fusion protein has acetyltransferase activity, deacetylase activity, methyltransferase activity, demethylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, demyristoylation activity, citrullination activity, helicase activity, amination activity, deamination activity, alkylation activity, dealkylation activity, oxidation activity, transcriptional activation activity, or transcriptional repressor activity. 25. The method of embodiment 24, wherein the non-nuclease domain of the fusion protein has cytosine deaminase activity, histone acetyltransferase activity, transcriptional activation activity, or transcriptional repressor activity. 26. 
The method of any one of embodiments 16 to 25, wherein the at least one programmable DNA-binding protein is a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase. 27. The method of any one of embodiments 16 to 26, wherein the programmable DNA modification protein is a CRISPR/Cas nuclease system, a CRISPR/Cas dual nickase system, or a catalytically inactive CRISPR/Cas system linked to a non-nuclease domain, and the at least one programmable DNA-binding protein is a catalytically inactive CRISPR/Cas system, wherein each CRISPR/Cas system comprises a CRISPR/Cas protein and a guide RNA. 28. The method of embodiment 27, wherein the guide RNA of each CRISPR/Cas system is at least partially chemically synthesized. 29. The method of embodiment 27, wherein the guide RNA of each CRISPR/Cas system is enzymatically synthesized. 30. The method of any one of embodiments 16 to 29, wherein the eukaryotic cell is in vitro. 31. The method of any one of embodiments 16 to 29, wherein the eukaryotic cell is in vivo. 32. The method of any one of embodiments 16 to 31, wherein the eukaryotic cell is a mammalian cell. 33. The method of embodiment 32, wherein the mammalian cell is a human cell. 34. The method of embodiment 32, wherein the mammalian cell is a non-human cell. 35. A method for detecting a chromosomal sequence in a eukaryotic cell, the method comprising: I. introducing into the eukaryotic cell (a) a programmable DNA-binding protein comprising at least one detectable marker domain or nucleic acid encoding the programmable DNA-binding protein comprising at least one detectable marker domain; and (b) at least one programmable DNA-binding protein or nucleic acid encoding the at least one programmable DNA-binding protein, wherein the programmable DNA-binding protein comprising at least one detectable marker domain is targeted to a target chromosomal sequence and each of the at least one programmable DNA-binding protein is targeted to a site proximal to the target chromosomal sequence, and binding of the at least one programmable DNA-binding protein to the site proximal to the target chromosomal sequence increases the accessibility of the programmable DNA-binding protein comprising at least one detectable marker domain to the target chromosomal sequence; and II. detecting the programmable DNA-binding protein comprising at least one detectable marker domain bound to the target chromosomal sequence. 36. The method of embodiment 35, wherein the site proximal to the target chromosomal sequence is located within about 250 base pairs on either side of the target chromosomal sequence. 37. The method of embodiment 36, wherein the site proximal to the target chromosomal sequence is located within about 100 base pairs on either side of the target chromosomal sequence. 38. The method of embodiment 37, wherein the site proximal to the target chromosomal sequence is located within about 75 base pairs on either side of the target chromosomal sequence. 39. The method of embodiment 38, wherein the site proximal to the target chromosomal sequence is located within about 50 base pairs on either side of the target chromosomal sequence. 40. The method of embodiment 39, wherein the site proximal to the target chromosomal sequence is located within about 25 base pairs on either side of the target chromosomal sequence. 41. 
The method of any one of embodiments 35 to 40, wherein the at least one detectable marker domain of the programmable DNA-binding protein comprising at least one detectable marker domain is a fluorescent protein, a fluorescent tag, an epitope tag, or a naturally occurring epitope within the programmable DNA-binding protein. 42. The method of any one of embodiments 35 to 41, wherein the programmable DNA-binding protein comprising at least one detectable marker domain is a catalytically inactive CRISPR/Cas system linked to at least one detectable marker domain, a catalytically inactive meganuclease linked to at least one detectable marker domain, a zinc finger protein linked to at least one detectable marker domain, or a transcription activator-like effector linked to at least one detectable marker domain. 43. The method of any one of embodiments 35 to 42, wherein the at least one programmable DNA-binding protein is a catalytically inactive CRISPR/Cas system, a catalytically inactive meganuclease, a zinc finger protein, a transcription activator-like effector, a CRISPR/Cas nickase, a ZFN nickase, a TALEN nickase, or a meganuclease nickase. 44. The method of any one of embodiments 35 to 43, wherein the programmable DNA-binding protein comprising at least one detectable marker domain is a catalytically inactive CRISPR/Cas system linked to at least one detectable marker domain, and the at least one programmable DNA-binding protein is a catalytically inactive CRISPR/Cas system, wherein each CRISPR/Cas system comprises a CRISPR/Cas protein and a guide RNA. 45. The method of embodiment 44, wherein the guide RNA of each CRISPR/Cas system is at least partially chemically synthesized. 46. The method of embodiment 44, wherein the guide RNA of each CRISPR/Cas system is enzymatically synthesized. 47. The method of any one of embodiments 35 to 46, wherein the eukaryotic cell is a mammalian cell. 48. The method of embodiment 47, wherein the mammalian cell is a human cell. 49. 
The method of embodiment 47, wherein the mammalian cell is a non-human cell. 50. The method of any one of embodiments 35 to 49, wherein the eukaryotic cell is live or fixed. 51. The method of any one of embodiments 35 to 50, wherein the detecting comprises dynamic live cell imaging, fluorescent microscopy, confocal microscopy, immunofluorescence, immunodetection, RNA-protein binding, or protein-protein binding.
EXAMPLES
[00148] The following examples illustrate certain aspects of the invention.
Example 1. Enhancement of Francisella novicida CRISPR-Cas9 (FnCas9) gene editing
[00149] FnCas9 is a type IIB CRISPR-Cas9. It exhibits higher intrinsic specificity than the widely used SpCas9, but has been found to be less robust than SpCas9 in human cells. To determine whether binding of programmable DNA-binding proteins to proximal sites could enable the nuclease to cleave an otherwise inaccessible target (i.e., the POR locus) in human cells, K562 cells were transfected with 5.6 μg of FnCas9 plasmid DNA, 5 μg of catalytically dead SpCas9 (SpdCas9) plasmid DNA and 3 μg of plasmid DNA of each sgRNA per one million cells (see FIG. 2). Genomic DNA was harvested 3 days after transfection and the target region was amplified by PCR with the forward primer 5'-CTCCCCTGCTTCTTGTCGTAT-3' (SEQ ID NO: 9) and the reverse primer 5'-ACAGGTCGTGGACACTCACA-3' (SEQ ID NO: 10). Targeted insertions/deletions (indels) introduced by FnCas9 at the target were determined by Cel-I nuclease digestion and polyacrylamide gel analysis.
[00150] As shown in FIG. 2, FnCas9 was unable to cleave the target when transfected alone. However, when it was transfected in combination with SpdCas9 to help disrupt the local chromatin configuration, FnCas9 was able to cleave the target at robust levels, with 10-11% indels, when SpdCas9 was used to bind one proximal site. When SpdCas9 was used to bind two proximal sites, FnCas9 activity further increased to 28% indels. These results demonstrate that the method described herein can enable an endonuclease to cleave an otherwise inaccessible target efficiently, and that there is a synergistic effect between two sites used to disrupt the local chromatin configuration.
Example 2. Enhancement of Campylobacter jejuni CRISPR-Cas9 (CjCas9) gene editing
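This description does not state how indel percentages were derived from the Cel-I digestion gels. By way of a hedged illustration only, one estimate commonly used for mismatch-cleavage assays converts band intensities to an indel percentage as 100 × (1 − √(1 − f)), where f is the fraction of PCR product that was cleaved. The function name `estimate_indel_percent` and the intensity values below are hypothetical and are not data from the examples.

```python
import math
from typing import Sequence


def estimate_indel_percent(uncut_intensity: float,
                           cleaved_intensities: Sequence[float]) -> float:
    """Estimate percent indels from mismatch-cleavage gel band intensities.

    Uses the commonly cited approximation 100 * (1 - sqrt(1 - f)), where f is
    the cleaved fraction of total band intensity. This is an assumption of
    the sketch, not a formula stated in this description.
    """
    cleaved = sum(cleaved_intensities)
    total = uncut_intensity + cleaved
    if total <= 0 or uncut_intensity < 0 or any(c < 0 for c in cleaved_intensities):
        raise ValueError("band intensities must be non-negative with a positive total")
    fraction_cleaved = cleaved / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical densitometry values: parental band 81, cleavage bands 10 and 9.
print(round(estimate_indel_percent(81.0, [10.0, 9.0]), 1))
```

The square root reflects that cleavable heteroduplexes form only when a modified strand reanneals with an unmodified one, so the cleaved fraction understates the mutant fraction if used directly.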
[00151] CjCas9 is a type IIC CRISPR-Cas9. It is the smallest Cas9 characterized thus far and has a unique ACAY PAM requirement. However, the nuclease has been found to be inactive on most targets in human cells. To determine whether the methods described herein could enable the CjCas9 protein to bind an inaccessible target in human cells, K562 cells were transfected with 4.2 μg of Flag-tagged catalytically dead CjCas9 (CjdCas9) plasmid DNA, 5 μg of catalytically dead SpCas9 (SpdCas9) plasmid DNA and 3 μg of plasmid DNA of each sgRNA per one million cells (see FIG. 3A). The cells were fixed in formaldehyde 16 hours after transfection and chromatin immunoprecipitation (ChIP) was performed using anti-Flag antibody. Binding of Flag-CjdCas9 to the target was determined by droplet digital PCR (ddPCR).
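As a purely illustrative sketch, candidate ACAY motifs (Y being the IUPAC code for C or T) can be located in a DNA sequence as shown below. The function name `find_motif` and the test sequence are hypothetical; the sketch scans only the strand supplied and does not model the position of the PAM relative to a protospacer or any other targeting requirement.

```python
import re
from typing import List

# Subset of IUPAC nucleotide codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def find_motif(sequence: str, motif: str = "ACAY") -> List[int]:
    """Return 0-based start positions of every occurrence of motif.

    A zero-width lookahead is used so that overlapping occurrences are
    reported as well.
    """
    body = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical sequence containing ACAC at position 2 and ACAT at position 8.
print(find_motif("GGACACTTACATG"))
```

To search the opposite strand as well, the same function can be applied to the reverse complement of the sequence.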
[00152] As shown in FIG. 3C, Flag-CjdCas9 was able to bind to a previously known accessible target at the AAVS1 locus, but was unable to bind to an inaccessible target at the POR locus when it was transfected alone. However, when it was transfected in combination with SpdCas9 to disrupt the local chromatin configuration, Flag-CjdCas9 was able to bind the POR target even more efficiently than its binding to the AAVS1 target.
[00153] To examine the effect on target DNA cleavage, K562 cells were transfected with 4.2 μg of CjCas9 plasmid DNA, 5 μg of SpdCas9 plasmid DNA and 3 μg of plasmid DNA of each sgRNA per one million cells. Genomic DNA was harvested 3 days after transfection and the target region was amplified by PCR with the forward primer 5'-CTCCCCTGCTTCTTGTCGTAT-3' (SEQ ID NO: 9) and the reverse primer 5'-ACAGGTCGTGGACACTCACA-3' (SEQ ID NO: 10). The cleavage activity of CjCas9 on the POR target was determined by Cel-I nuclease digestion and polyacrylamide gel analysis. As shown in FIG. 4, CjCas9 was unable to cleave the target without SpdCas9. However, when transfected in combination with SpdCas9, CjCas9 was able to cleave the target efficiently, with 34.1-37.9% indels. These results demonstrate that the method described herein can enable a nuclease to efficiently bind and cleave an otherwise inaccessible target.
Example 3. Enhancement of Francisella novicida Cpf1 (FnCpf1) gene editing
[00154] FnCpf1 is a type V CRISPR-Cas system. Cpf1 systems are significantly divergent from type II CRISPR-Cas9 systems. Unlike Cas9 systems, Cpf1 systems use a 5' T-rich PAM and a single guide RNA for targeting without a tracrRNA (Zetsche et al., Cell, 2015, 163: 1-13). These "newer" CRISPR systems have the potential to make the practice of gene editing even simpler; however, many Cpf1 systems have been found to be inactive in human cells. To determine whether the methods described herein could enable the divergent, "inactive" Cpf1 nuclease to cleave endogenous targets in human cells, K562 cells were transfected with 5 μg of FnCpf1 plasmid DNA, 5 μg of SpdCas9 plasmid DNA and 3 μg of plasmid DNA of each sgRNA per one million cells (see FIG. 5). Genomic DNA was harvested 3 days after transfection and the target region was amplified by PCR with the forward primer 5'-CTCCCCTGCTTCTTGTCGTAT-3' (SEQ ID NO: 9) and the reverse primer 5'-ACAGGTCGTGGACACTCACA-3' (SEQ ID NO: 10). FnCpf1 cleavage activity on a POR target was determined by Cel-I nuclease digestion and polyacrylamide gel analysis.
[00155] As shown in FIG. 5, FnCpf1 was unable to cleave the target when it was transfected alone, but was able to cleave the target efficiently when it was transfected in combination with SpdCas9. These results demonstrate that the method described herein is applicable to divergent type V CRISPR-Cas systems.
Example 4. Selective editing between identical targets in human HBB and HBD
[00156] Two identical targets in the human genome (i.e., in HBB and HBD) were used to determine whether the methods disclosed herein could facilitate selective editing between identical sites in different genes. K562 cells were transfected with 4.2 μg of CjCas9 plasmid DNA, 5 μg of SpdCas9 plasmid DNA and 3 μg of plasmid DNA of each sgRNA per one million cells (see FIG. 6). Genomic DNA was harvested 3 days after transfection and the two target regions were amplified by PCR with the forward primer 5'-CGGCTGTCATCACTTAGACCTCA-3' (SEQ ID NO: 11) and the reverse primer 5'-GCAGCCTAAGGGTGGGAAAATAGA-3' (SEQ ID NO: 12) for HBB, and the forward primer 5'-AGGGCAAGTTAAGGGAATAGTGGAA-3' (SEQ ID NO: 13) and the reverse primer 5'-CCAAGGGTAGACCACCAGTAATCTG-3' (SEQ ID NO: 14) for HBD. The cleavage activity of CjCas9 on the HBB and HBD targets was determined by Cel-I nuclease digestion and polyacrylamide gel analysis.
[00157] As shown in FIG. 6, when transfected alone, CjCas9 was unable to cleave either target. However, when transfected in combination with SpdCas9 targeted to sites proximal to HBB, CjCas9 cleaved the HBB target efficiently, but was still unable to cleave the identical HBD target. The two Cel-I nuclease digestion bands in the first two lanes were caused by SNPs present in the K562 cell population. These results demonstrate the unique ability of the described method to improve the selectivity of gene editing.
Example 5. Enhancement of Streptococcus pyogenes CRISPR-Cas9 (SpCas9) gene editing
[00158] SpCas9 is a type IIA CRISPR-Cas9 and has been widely used in genome modification because of its robust activity in eukaryotic cells. However, its activity can also vary widely from target to target. To determine whether the methods described herein could also enhance this nuclease, K562 cells were transfected with 5 μg of SpCas9 plasmid DNA, 5.6 μg of catalytically dead FnCas9 (FndCas9) and 3 μg of plasmid DNA of each sgRNA per one million cells (see FIG. 7). Genomic DNA was harvested 3 days after transfection and the target region was amplified by PCR with the forward primer 5'-CTCCCCTGCTTCTTGTCGTAT-3' (SEQ ID NO: 9) and the reverse primer 5'-ACAGGTCGTGGACACTCACA-3' (SEQ ID NO: 10). SpCas9 cleavage activity on the POR target was determined by Cel-I nuclease digestion and polyacrylamide gel analysis.
[00159] As shown in FIG. 7, SpCas9 cleavage activity increased significantly when it was transfected in combination with FndCas9, compared to when it was transfected alone. These results show that the method described herein can also be applied to robust endonucleases.
Example 6. Enhancement of gene editing using ssDNA oligo donor
[00160] K562 cells were transfected with 4.2 μg of CjCas9 plasmid DNA, 5 μg of SpdCas9 plasmid DNA, 3 μg of plasmid DNA of each sgRNA and 300 pmol of an 88-nt ssDNA oligo donor for targeted integration of an EcoRI restriction site, per one million cells. Genomic DNA was harvested 3 days after transfection and the target region was amplified by PCR with the forward primer 5'-CTCCCCTGCTTCTTGTCGTAT-3' (SEQ ID NO: 9) and the reverse primer 5'-ACAGGTCGTGGACACTCACA-3' (SEQ ID NO: 10). Targeted integration of the EcoRI restriction site was determined by digestion with EcoRI restriction enzyme and polyacrylamide gel analysis. As shown in FIG. 8, the restriction site was efficiently integrated (28-37%) into the POR locus when the ssDNA oligo donor was transfected together with CjCas9 and SpdCas9, whereas no integration was detected when the oligo donor was transfected alone or in combination with CjCas9 without SpdCas9. These results demonstrate that the method described herein can facilitate efficient gene editing using an ssDNA oligo donor on an otherwise inaccessible target.
Example 7. Enhancement of sequence-specific genomic DNA detection in live and fixed cells
[00161] The fusion of Cas9 proteins to fluorescent proteins made it possible to detect chromosomal dynamics in living cells (Chen et al., Cell, 2013, 155: 1479-91). Therefore, it is believed that the structural dynamics of chromatin will influence the ability of complexes in the CRISPR / Cas system to access various genomic loci. Thus, it is believed that the placement of CRISPR complexes (dCas9) close to those housing dCas9-GFP enhances the detection of chromosomal dynamics to an extent similar to that observed in Example 2 for chromatin immunoprecipitation. For example, CjdCas9 can be fused to GFP and targeted to a region with a chromatin state that prevents the detectable binding of CjdCas9-GFP. The SpdCas9-based system can then be designed in proximity to CjdCas9-GFP targets to produce a detectable signal. For chromatin regions that are resistant to the binding and detection of SpdCas9-GFP, a proximal FndCas9 molecule can be used to enhance detection to an extent similar to that shown in Example 5 for proximal targeting of SpCas9 and FndCas9 and enhancement of the disruption activity of double ribbon. In addition, given that previous studies have indicated that the extent of hybridization requirements between CRISPR guide RNA and genomic DNA may be less for binding than for double-stranded cleavage (Wu et al., Nature Biotechnology, 2014, 32 ( 7): 670-6), it is believed that the use of proximal CRISPR binding increases signal-to-noise ratios for the detection of genomic DNA in cells.
[00162] Similar CRISPR-based detection methods have been applied to fixed cells (Deng et al., Proc. Natl. Acad. Sci. USA, 2015, 112(38): 11870-75). Thus, it is believed that proximal CRISPR targeting will enhance the detection of fixed DNA in a manner similar to that described above for living cells. Since the strands of genomic DNA in fixed cells are chemically cross-linked, interrogation of sequence information by hybridization of nucleic acid probes typically requires a pretreatment step with thermal or chemical processing to sufficiently separate the strands. It is therefore possible that proximal CRISPR targeting will make fixed DNA more accessible and reduce the extent of (or the requirement for) thermal or chemical treatment of fixed cells. Eliminating thermal or chemical treatment would simplify the diagnostic protocol and preserve intracellular molecular structures that better reflect the biology of living cells, and would therefore yield more informative diagnostic results.
Example 8. Enhancement of CRISPR-based gene activation and repression in eukaryotic cells.
[00163] Fusion of Cas9 proteins to transcriptional regulation domains has enabled activation and repression of target genes (Konermann et al., Nature, 2014; 517(7536): 583-8; Gilbert et al., Cell, 2014, 159(3): 647-661). Structural chromatin dynamics are believed to influence the ability of the CRISPR complex to access various genomic loci and induce activation or repression. Thus, the placement of CRISPR (dCas9) complexes proximal to those harboring dCas9 fused to transcriptional regulation domains is believed to enhance the regulation of target genes to an extent similar to that observed in Example 2 for chromatin immunoprecipitation. For chromatin regions that are resistant to binding and modification by SpdCas9 transcriptional regulators, a proximal FndCas9 molecule can be used to enhance gene activation or repression to an extent similar to that shown in Example 5 for proximal targeting of SpCas9 and FndCas9 and enhancement of double-strand cleavage activity.
Example 9. Enhancement of CRISPR-based epigenetic modification in eukaryotic cells.
[00164] Fusion of Cas9 proteins to epigenetic modification domains has enabled targeted epigenetic chromosomal modifications, such as histone acetylation by p300 or cytosine deamination by cytosine deaminase (Hilton et al., Nat. Biotechnol., 2015, 33(5): 510-7; Komor et al., Nature, 2016, 533(7603): 420-4). It is believed that structural chromatin dynamics influence the ability of the CRISPR complex to access various genomic loci. Thus, the placement of CRISPR (dCas9) complexes proximal to those harboring dCas9 fused to epigenetic modifiers should enhance the targeted epigenetic modification of chromosomal DNA, local proteins, or local RNA to an extent similar to that observed in Example 2 for chromatin immunoprecipitation. For chromatin regions that are resistant to binding and modification by SpdCas9 epi-modifiers, a proximal FndCas9 molecule can be used to enhance detection to an extent similar to that shown in Example 5 for proximal targeting of SpCas9 and FndCas9 and enhancement of double-strand cleavage activity.
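As a worked illustration of the Example 6 readout, the following Python sketch scales the per-million-cell transfection amounts reported there and estimates a percent integration from gel band intensities. The amounts come from Example 6; the function names, the assumption that reagents scale linearly with cell number, and the band-intensity formula (digested fraction = cleaved bands over all bands) are illustrative assumptions and are not stated in the description.

```python
# Per-million-cell amounts taken from Example 6 (units are in the key names).
PER_MILLION_CELLS = {
    "CjCas9_plasmid_ug": 4.2,
    "SpdCas9_plasmid_ug": 5.0,
    "each_sgRNA_plasmid_ug": 3.0,
    "ssDNA_donor_pmol": 300.0,
}


def scale_transfection(n_cells):
    """Scale the Example 6 amounts to n_cells (assumes linear scaling)."""
    factor = n_cells / 1_000_000
    return {name: round(amount * factor, 3)
            for name, amount in PER_MILLION_CELLS.items()}


def percent_integration(uncut, cut_a, cut_b):
    """Estimate % targeted integration from background-subtracted band intensities.

    Assumption (not from the source): band intensity is proportional to DNA
    mass, so the EcoRI-digested fraction is (cut_a + cut_b) / (all bands).
    """
    total = uncut + cut_a + cut_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * (cut_a + cut_b) / total


if __name__ == "__main__":
    print(scale_transfection(2_000_000))
    print(percent_integration(uncut=70.0, cut_a=20.0, cut_b=10.0))
```

The intensity values in the usage lines are made-up inputs for demonstration only; they are not measurements from FIG. 8.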
Claims:
Claims (24)
[0001]
1. Composition, characterized by the fact that it comprises: (a) an RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR) nuclease system, or a nucleic acid encoding said CRISPR nuclease system, wherein the CRISPR nuclease system comprises (i) a programmable DNA modification protein that is a CRISPR protein and (ii) a guide RNA; and (b) at least one catalytically inactive CRISPR system, or nucleic acid encoding said at least one catalytically inactive CRISPR system, wherein each catalytically inactive CRISPR system comprises (i) a programmable DNA binding protein that is a catalytically inactive CRISPR protein and (ii) a guide RNA; and wherein (c) the programmable DNA modification protein is a type II CRISPR protein and the at least one programmable DNA binding protein is a type II CRISPR protein, or (d) the programmable DNA modification protein is a type V CRISPR protein and the at least one programmable DNA binding protein is a type II CRISPR protein; and wherein the composition as defined above is not disclosed in WO2017/070598 or WO2017/096328.
[0002]
2. Composition according to claim 1, characterized by the fact that: (i) the type II CRISPR protein is selected from Francisella novicida CRISPR-Cas9 (FnCas9), Campylobacter jejuni CRISPR-Cas9 (CjCas9) and Streptococcus pyogenes CRISPR-Cas9 (SpCas9); and/or (ii) the type V CRISPR protein is Francisella novicida CRISPR-Cpf1 (FnCpf1).
[0003]
3. Composition according to claim 1 or 2, characterized by the fact that the at least one programmable DNA-binding protein is a type II CRISPR/Cas9 protein, wherein the Cas9 protein has one or more mutations in each of the RuvC-type domain and the HNH-type domain.
[0004]
4. Composition according to claim 3, characterized by the fact that: (i) the one or more mutations in the RuvC-type domain are D10A, D8A, E762A and / or D986A; and (ii) the one or more mutations in the HNH type domain are H840A, H559A, N854A, N856A and / or N863A.
[0005]
5. Composition according to any one of claims 1 to 4, characterized in that the nucleic acid that encodes each CRISPR protein is mRNA or DNA.
[0006]
6. Composition according to any one of claims 1 to 5, characterized in that the nucleic acid encoding each CRISPR protein is part of a plasmid vector or a viral vector.
[0007]
7. Composition according to any one of claims 1 to 6, characterized in that the nucleic acid encoding each guide RNA is part of a plasmid vector or a viral vector.
[0008]
8. Composition according to any one of claims 1 to 7, characterized in that the guide RNA of each CRISPR system is: (a) enzymatically synthesized or (b) is at least partially chemically synthesized.
[0009]
9. Kit, characterized by the fact that it comprises the composition as defined in any one of claims 1 to 8 and instructions for using the composition in an in vitro method for increasing the efficiency and/or specificity of targeted genome modification in a eukaryotic cell.
[0010]
10. In vitro method for increasing the efficiency and/or specificity of targeted genome modification in a eukaryotic cell, the method characterized by the fact that it comprises introducing into the eukaryotic cell a composition comprising: (a) an RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR) nuclease system, or a nucleic acid encoding said CRISPR nuclease system, wherein the CRISPR nuclease system comprises (i) a programmable DNA modification protein that is a CRISPR protein and (ii) a guide RNA; and (b) at least one catalytically inactive CRISPR system, or nucleic acid encoding said at least one catalytically inactive CRISPR system, wherein each catalytically inactive CRISPR system comprises (i) a programmable DNA binding protein that is a catalytically inactive CRISPR protein and (ii) a guide RNA; and wherein (c) the programmable DNA modification protein is a type II CRISPR protein and the at least one programmable DNA binding protein is a type II CRISPR protein, or (d) the programmable DNA modification protein is a type V CRISPR protein and the at least one programmable DNA binding protein is a type II CRISPR protein; wherein the programmable DNA modification protein is targeted to a target chromosomal sequence and each of the at least one programmable DNA binding protein is targeted to a site proximal to the target chromosomal sequence, and binding of the at least one programmable DNA binding protein to the site proximal to the target chromosomal sequence increases accessibility of the programmable DNA modification protein to the target chromosomal sequence, thereby increasing the efficiency and/or specificity of the targeted genome modification.
[0011]
11. Method according to claim 10, characterized by the fact that the site proximal to the target chromosomal sequence is located within 250, 100, 75, 50 or 25 base pairs on either side of the target chromosomal sequence.
[0012]
12. Method according to claim 10 or 11, characterized in that the eukaryotic cell is a mammalian cell, wherein the mammalian cell is a human cell or a non-human cell.
[0013]
13. Method according to any one of claims 10 to 12, characterized by the fact that: (i) the type II CRISPR protein is selected from Francisella novicida CRISPR-Cas9 (FnCas9), Campylobacter jejuni CRISPR-Cas9 (CjCas9) and Streptococcus pyogenes CRISPR-Cas9 (SpCas9); and/or (ii) the type V CRISPR protein is Francisella novicida CRISPR-Cpf1 (FnCpf1).
[0014]
14. Method according to any one of claims 10 to 13, characterized in that the at least one programmable DNA-binding protein is a type II CRISPR/Cas9 protein, wherein the Cas9 protein has one or more mutations in each of the RuvC-type domain and the HNH-type domain.
[0015]
15. Method according to claim 14, characterized by the fact that (i) the one or more mutations in the RuvC-type domain are D10A, D8A, E762A and / or D986A; and (ii) the one or more mutations in the HNH type domain are H840A, H559A, N854A, N856A and / or N863A.
[0016]
16. Method according to any one of claims 10 to 15, characterized in that the nucleic acid encoding each CRISPR protein is mRNA or DNA.
[0017]
17. Method according to any one of claims 10 to 16, characterized in that the nucleic acid encoding each CRISPR protein is part of a plasmid vector or a viral vector.
[0018]
18. Method according to any one of claims 10 to 17, characterized in that the nucleic acid encoding each guide RNA is part of a plasmid vector or viral vector.
[0019]
19. Method according to any one of claims 10 to 18, characterized in that the guide RNA of each CRISPR system is (a) enzymatically synthesized or (b) is at least partially chemically synthesized.
[0020]
20. Composition according to any one of claims 1 to 8, or kit according to claim 9, characterized in that it is for use in therapy.
[0021]
21. Composition according to any one of claims 1 to 8, or kit according to claim 9, characterized by the fact that it is for use in the treatment of sickle cell disease, thalassemia, severe combined immune deficiency (SCID), Huntington's disease or retinitis pigmentosa.
[0022]
22. Method for detecting a chromosomal sequence in a eukaryotic cell, characterized by the fact that it comprises: (I) introducing into the eukaryotic cell (a) a programmable DNA binding protein comprising at least one detectable marker domain, or nucleic acid encoding the programmable DNA binding protein comprising at least one detectable marker domain; and (b) at least one programmable DNA binding protein, or nucleic acid encoding the at least one programmable DNA binding protein, wherein the programmable DNA binding protein comprising at least one detectable marker domain is targeted to a target chromosomal sequence and each of the at least one programmable DNA binding protein is targeted to a site proximal to the target chromosomal sequence, and binding of the at least one programmable DNA binding protein to the site proximal to the target chromosomal sequence increases accessibility of the programmable DNA binding protein comprising at least one detectable marker domain to the target chromosomal sequence; and (II) detecting the programmable DNA binding protein comprising at least one detectable marker domain bound to the target chromosomal sequence.
[0023]
23. Use of: (a) an RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR) nuclease system, or a nucleic acid encoding said CRISPR nuclease system, wherein the CRISPR nuclease system comprises (i) a programmable DNA modification protein that is a CRISPR protein and (ii) a guide RNA; and (b) at least one catalytically inactive CRISPR system, or nucleic acid encoding said at least one catalytically inactive CRISPR system, wherein each catalytically inactive CRISPR system comprises (i) a programmable DNA binding protein that is a catalytically inactive CRISPR protein and (ii) a guide RNA; and wherein (c) the programmable DNA modification protein is a type II CRISPR protein and the at least one programmable DNA binding protein is a type II CRISPR protein, or (d) the programmable DNA modification protein is a type V CRISPR protein and the at least one programmable DNA binding protein is a type II CRISPR protein, characterized by the fact that it is in the preparation of a composition for use in therapy.
[0024]
24. Composition according to any one of claims 1 to 8, kit according to claim 9, method according to any one of claims 10 to 19 and 22, or use according to claim 23, characterized in that: (i) the CRISPR system of subpart (a) is a type IIB CRISPR/Cas9 protein from Francisella novicida and the at least one CRISPR system of subpart (b) is a type IIA CRISPR/Cas9 protein from Streptococcus pyogenes; (ii) the CRISPR system of subpart (a) is a type IIA CRISPR/Cas9 protein from S. pyogenes and the at least one CRISPR system of subpart (b) is a type IIB CRISPR/Cas9 protein from F. novicida; (iii) the CRISPR system of subpart (a) is a type IIC CRISPR/Cas9 protein from Campylobacter jejuni and the at least one CRISPR system of subpart (b) is a type IIA CRISPR/Cas9 protein from S. pyogenes; or (iv) the CRISPR system of subpart (a) is a type V CRISPR/Cpf1 protein from F. novicida and the at least one CRISPR system of subpart (b) is a type IIA CRISPR/Cas9 protein from S. pyogenes.
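The two structural conditions recited in the claims, the type pairing of claim 1 (c)/(d) and the proximity windows of claim 11, can be sketched in Python as simple checks. This is a minimal illustration under stated assumptions: the function names are hypothetical, coordinates are half-open [start, end) intervals on one chromosome, and distance is measured edge to edge between the target sequence and the binding site, a convention the claims do not specify.

```python
from typing import Optional, Tuple

# Distances recited in claim 11, in base pairs.
PROXIMITY_WINDOWS_BP = (250, 100, 75, 50, 25)

Interval = Tuple[int, int]  # half-open [start, end); an illustrative convention


def gap_bp(target: Interval, site: Interval) -> int:
    """Base pairs separating two intervals; 0 if they touch or overlap.

    Assumption: distance is measured edge to edge, on either side of the target.
    """
    (t_start, t_end), (s_start, s_end) = target, site
    return max(0, s_start - t_end, t_start - s_end)


def tightest_window(target: Interval, site: Interval) -> Optional[int]:
    """Smallest claim-11 window containing the site, or None if beyond 250 bp."""
    gap = gap_bp(target, site)
    fitting = [w for w in PROXIMITY_WINDOWS_BP if gap <= w]
    return min(fitting) if fitting else None


def pairing_allowed(modifier_type: str, binder_type: str) -> bool:
    """Claim 1 (c)/(d): a type II or type V modifier with a type II binder."""
    return binder_type == "II" and modifier_type in {"II", "V"}


if __name__ == "__main__":
    target = (1000, 1023)        # hypothetical target sequence coordinates
    binding_site = (1050, 1073)  # hypothetical dCas9 binding site
    print(gap_bp(target, binding_site), tightest_window(target, binding_site))
    print(pairing_allowed("V", "II"), pairing_allowed("II", "V"))
```

The coordinates in the usage lines are invented for demonstration and do not correspond to any locus in the examples.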
Similar technologies:
Publication number | Publication date | Patent title
BR112018074531B1|2021-01-19|composition comprising a nuclease system and its use, kit, in vitro method to increase the efficiency of modifying the target genome and method for detecting a chromosomal sequence
EP3428274B1|2021-10-13|Using nucleosome interacting protein domains to enhance targeted genome modification
BR112020010479A2|2020-11-24|genetically modified cas9 systems for eukaryotic genome modification
JP2020530992A|2020-11-05|Synthetic guide RNA for CRISPR / CAS activator systems
KR20190113759A|2019-10-08|DNA plasmids for rapid generation of homologous recombinant vectors for cell line development
JP6994560B2|2022-02-21|Use of nucleosome-interacting protein domains to enhance targeted genomic modification
AU2022200851A1|2022-03-03|Using nucleosome interacting protein domains to enhance targeted genome modification
Patent family:
Publication number | Publication date
EP3907286A4|2021-11-10|
GB2552861A|2018-02-14|
IL275244A|2022-02-01|
GB2582731A|2020-09-30|
KR20190025565A|2019-03-11|
EP3604527A1|2020-02-05|
AU2017274145A1|2018-11-22|
BR112018074531A2|2019-03-19|
GB202010846D0|2020-08-26|
ES2760477T3|2020-05-14|
IL289989D0|2022-03-01|
AU2017274145B2|2020-07-23|
EP3272867B1|2019-09-25|
US20170349913A1|2017-12-07|
IL262854D0|2018-12-31|
US20190169650A1|2019-06-06|
CA3026321A1|2017-12-07|
CN109983124A|2019-07-05|
PT3604527T|2021-06-02|
JP2019517795A|2019-06-27|
LT3604527T|2021-06-25|
JP6878468B2|2021-05-26|
PL3272867T3|2020-01-31|
DK3604527T3|2021-05-31|
EP3907286A1|2021-11-10|
US20190169651A1|2019-06-06|
PT3272867T|2019-12-04|
GB2578802B8|2020-11-04|
GB2552861B|2019-05-15|
AU2021200636A1|2021-03-04|
GB2582731B|2020-12-30|
JP2021121193A|2021-08-26|
IL262854A|2020-06-30|
AU2020244497A1|2020-10-29|
GB2582731B8|2021-10-27|
GB201904166D0|2019-05-08|
ES2881355T3|2021-11-29|
PL3604527T3|2021-09-06|
GB2578802B|2020-09-16|
AU2020244497B2|2020-11-12|
DK3272867T3|2019-12-02|
EP3272867A1|2018-01-24|
IL275244D0|2020-07-30|
US10266851B2|2019-04-23|
WO2017209809A1|2017-12-07|
SG11201810003UA|2018-12-28|
GB201702743D0|2017-04-05|
EP3604527B1|2021-05-12|
GB2578802A|2020-05-27|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title

JPS501B1|1970-05-19|1975-01-06|
JP2610720B2|1991-06-24|1997-05-14|株式会社クボタ|Work machine rolling control device|
US6534261B1|1999-01-12|2003-03-18|Sangamo Biosciences, Inc.|Regulation of endogenous gene expression in cells using zinc finger proteins|
GB9915126D0|1999-06-30|1999-09-01|Imp College Innovations Ltd|Control of gene expression|
WO2001083751A2|2000-04-28|2001-11-08|Sangamo Biosciences, Inc.|Methods for binding an exogenous molecule to cellular chromatin|
AU2002360424A1|2001-11-26|2003-06-10|Advanced Cell Technology, Inc.|Methods for making and using reprogrammed human somatic cell nuclei and autologous and isogenic human stem cells|
US7972854B2|2004-02-05|2011-07-05|Sangamo Biosciences, Inc.|Methods and compositions for targeted cleavage and recombination|
JP4555292B2|2003-08-08|2010-09-29|サンガモバイオサイエンシズインコーポレイテッド|Methods and compositions for targeted cleavage and recombination|
US20050220796A1|2004-03-31|2005-10-06|Dynan William S|Compositions and methods for modulating DNA repair|
JP2009502170A|2005-07-26|2009-01-29|サンガモバイオサイエンシズインコーポレイテッド|Targeted integration and expression of foreign nucleic acid sequences|
JP5156953B2|2006-03-08|2013-03-06|国立大学法人京都大学|Nucleic acid cleaving agent|
US8481272B2|2006-08-04|2013-07-09|Georgia State University Research Foundation, Inc.|Enzyme sensors, methods for preparing and using such sensors, and methods of detecting protease activity|
BRPI0808704B1|2007-03-02|2022-01-18|Dupont Nutrition Biosciences Aps|METHOD TO GENERATE AN INITIAL CULTURE COMPRISING AT LEAST TWO BACTERIOPHAGE-RESISTANT VARIANTS, INITIATOR CULTURE AND FERMENTATION METHOD|
GB0806086D0|2008-04-04|2008-05-14|Ulive Entpr Ltd|Dendrimer polymer hybrids|
US8546553B2|2008-07-25|2013-10-01|University Of Georgia Research Foundation, Inc.|Prokaryotic RNAi-like system and methods of use|
US20100076057A1|2008-09-23|2010-03-25|Northwestern University|TARGET DNA INTERFERENCE WITH crRNA|
US9404098B2|2008-11-06|2016-08-02|University Of Georgia Research Foundation, Inc.|Method for cleaving a target RNA using a Cas6 polypeptide|
US20120192298A1|2009-07-24|2012-07-26|Sigma Aldrich Co. Llc|Method for genome editing|
WO2010075424A2|2008-12-22|2010-07-01|The Regents Of University Of California|Compositions and methods for downregulating prokaryotic genes|
EP2534163B1|2010-02-09|2015-11-04|Sangamo BioSciences, Inc.|Targeted genomic modification with partially single-stranded donor molecules|
US10087431B2|2010-03-10|2018-10-02|The Regents Of The University Of California|Methods of generating nucleic acid fragments|
SG185481A1|2010-05-10|2012-12-28|Univ California|Endoribonuclease compositions and methods of use thereof|
KR101953237B1|2010-05-17|2019-02-28|상가모 테라퓨틱스, 인코포레이티드|Novel dna-binding proteins and uses thereof|
EP3489359A1|2010-07-23|2019-05-29|Sigma Aldrich Co. LLC|Genome editing using targeting endonucleases and single-stranded nucleic acids|
WO2012164565A1|2011-06-01|2012-12-06|Yeda Research And Development Co. Ltd.|Compositions and methods for downregulating prokaryotic genes|
WO2013074999A1|2011-11-16|2013-05-23|Sangamo Biosciences, Inc.|Modified dna-binding proteins and uses thereof|
GB201122458D0|2011-12-30|2012-02-08|Univ Wageningen|Modified cascade ribonucleoproteins and uses thereof|
WO2013141680A1|2012-03-20|2013-09-26|Vilnius University|RNA-DIRECTED DNA CLEAVAGE BY THE Cas9-crRNA COMPLEX|
US9637739B2|2012-03-20|2017-05-02|Vilnius University|RNA-directed DNA cleavage by the Cas9-crRNA complex|
US10174331B2|2012-05-07|2019-01-08|Sangamo Therapeutics, Inc.|Methods and compositions for nuclease-mediated targeted integration of transgenes|
KR20170134766A|2012-05-25|2017-12-06|더 리젠츠 오브 더 유니버시티 오브 캘리포니아|Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription|
CN110066775A|2012-10-23|2019-07-30|基因工具股份有限公司|Composition and application thereof for cutting target DNA|
ES2757325T3|2012-12-06|2020-04-28|Sigma Aldrich Co Llc|Modification and regulation of the genome based on CRISPR|
US8697359B1|2012-12-12|2014-04-15|The Broad Institute, Inc.|CRISPR-Cas systems and methods for altering expression of gene products|
ES2576128T3|2012-12-12|2016-07-05|The Broad Institute, Inc.|Modification by genetic technology and optimization of systems, methods and compositions for the manipulation of sequences with functional domains|
SG10201704932UA|2012-12-17|2017-07-28|Harvard College|Rna-guided human genome engineering|
CN103233028B|2013-01-25|2015-05-13|南京徇齐生物技术有限公司|Specie limitation-free eucaryote gene targeting method having no bio-safety influence and helical-structure DNA sequence|
KR20170108172A|2013-03-14|2017-09-26|카리부 바이오사이언시스 인코포레이티드|Compositions and methods of nucleic acid-targeting nucleic acids|
CN105408483A|2013-03-15|2016-03-16|通用医疗公司|Rna-guided targeting of genetic and epigenomic regulatory proteins to specific genomic loci|
US20140273230A1|2013-03-15|2014-09-18|Sigma-Aldrich Co., Llc|Crispr-based genome modification and regulation|
ES2670531T3|2013-05-29|2018-05-30|Cellectis S.A.|A method to produce an accurate DNA cleavage using the nickase activity of Cas9|
JP6621738B2|2013-06-04|2019-12-18|プレジデント アンド フェローズ オブ ハーバード カレッジ|RNA-induced transcriptional regulation|
EP3539573A1|2013-06-05|2019-09-18|Duke University|Rna-guided gene editing and gene regulation|
CN103343120B|2013-07-04|2015-03-04|中国科学院遗传与发育生物学研究所|Wheat genome site-specific modification method|
CN103388006B|2013-07-26|2015-10-28|华东师范大学|A kind of construction process of site-directed point mutation|
US20150044772A1|2013-08-09|2015-02-12|Sage Labs, Inc.|Crispr/cas system-based novel fusion protein and its applications in genome editing|
EP3110454B1|2014-02-24|2020-11-18|Sangamo Therapeutics, Inc.|Methods and compositions for nuclease-mediated targeted integration|
US20170198268A1|2014-07-09|2017-07-13|Gen9, Inc.|Compositions and Methods for Site-Directed DNA Nicking and Cleaving|
AU2015298571B2|2014-07-30|2020-09-03|President And Fellows Of Harvard College|Cas9 proteins including ligand-dependent inteins|
EP3686279A1|2014-08-17|2020-07-29|The Broad Institute, Inc.|Genome editing using cas9 nickases|
EP3186376B1|2014-08-27|2019-03-20|Caribou Biosciences, Inc.|Methods for increasing cas9-mediated engineering efficiency|
EP3188763B1|2014-09-02|2020-05-13|The Regents of The University of California|Methods and compositions for rna-directed target dna modification|
WO2016065364A1|2014-10-24|2016-04-28|Life Technologies Corporation|Compositions and methods for enhancing homologous recombination|
CA2963820A1|2014-11-07|2016-05-12|Editas Medicine, Inc.|Methods for improving crispr/cas-mediated genome-editing|
US10190106B2|2014-12-22|2019-01-29|University Of Massachusetts|Cas9-DNA targeting unit chimeras|
GB201506509D0|2015-04-16|2015-06-03|Univ Wageningen|Nuclease-mediated genome editing|
US9790490B2|2015-06-18|2017-10-17|The Broad Institute Inc.|CRISPR enzymes and systems|
WO2017040511A1|2015-08-31|2017-03-09|Agilent Technologies, Inc.|Compounds and methods for crispr/cas-based genome editing by homologous recombination|
DK3350327T3|2015-10-23|2019-01-21|Caribou Biosciences Inc|CONSTRUCTED CRISPR CLASS-2-NUCLEIC ACID TARGETING-NUCLEIC ACID|
BR112018010429A2|2015-12-04|2018-11-27|Caribou Biosciences, Inc.|nucleic acids that target engineered nucleic acids|
WO2017189821A1|2016-04-29|2017-11-02|Bio-Rad Laboratories, Inc.|Dimeric proteins for specific targeting of nucleic acid sequences|
EP2734621B1|2011-07-22|2019-09-04|President and Fellows of Harvard College|Evaluation and improvement of nuclease cleavage specificity|
US20150044192A1|2013-08-09|2015-02-12|President And Fellows Of Harvard College|Methods for identifying a target site of a cas9 nuclease|
US9359599B2|2013-08-22|2016-06-07|President And Fellows Of Harvard College|Engineered transcription activator-like effector (TALE) domains and uses thereof|
US9340799B2|2013-09-06|2016-05-17|President And Fellows Of Harvard College|MRNA-sensing switchable gRNAs|
US9526784B2|2013-09-06|2016-12-27|President And Fellows Of Harvard College|Delivery system for functional nucleases|
US9322037B2|2013-09-06|2016-04-26|President And Fellows Of Harvard College|Cas9-FokI fusion proteins and uses thereof|
US9068179B1|2013-12-12|2015-06-30|President And Fellows Of Harvard College|Methods for correcting presenilin point mutations|
AU2015298571B2|2014-07-30|2020-09-03|President And Fellows Of Harvard College|Cas9 proteins including ligand-dependent inteins|
CN108513575A|2015-10-23|2018-09-07|哈佛大学的校长及成员们|Nucleobase editing machine and application thereof|
US11078481B1|2016-08-03|2021-08-03|KSQ Therapeutics, Inc.|Methods for screening for cancer targets|
KR20190034628A|2016-08-03|2019-04-02|프레지던트 앤드 펠로우즈 오브 하바드 칼리지|Adenosine nucleobase editing agent and uses thereof|
US11078483B1|2016-09-02|2021-08-03|KSQ Therapeutics, Inc.|Methods for measuring and improving CRISPR reagent function|
US10745677B2|2016-12-23|2020-08-18|President And Fellows Of Harvard College|Editing of CCR5 receptor gene to protect against HIV infection|
US11268082B2|2017-03-23|2022-03-08|President And Fellows Of Harvard College|Nucleobase editors comprising nucleic acid programmable DNA binding proteins|
US10011849B1|2017-06-23|2018-07-03|Inscripta, Inc.|Nucleic acid-guided nucleases|
WO2020005383A1|2018-06-30|2020-01-02|Inscripta, Inc.|Instruments, modules, and methods for improved detection of edited sequences in live cells|
CA3066798A1|2017-07-31|2019-02-07|Sigma Aldrich Co. Llc|Synthetic guide rna for crispr/cas activator systems|
WO2019126709A1|2017-12-22|2019-06-27|The Broad Institute, Inc.|Cas12b systems, methods, and compositions for targeted dna base editing|
US20210071163A1|2017-12-22|2021-03-11|The Broad Institute, Inc.|Cas12b systems, methods, and compositions for targeted rna base editing|
WO2019200004A1|2018-04-13|2019-10-17|Inscripta, Inc.|Automated cell processing instruments comprising reagent cartridges|
WO2019209926A1|2018-04-24|2019-10-31|Inscripta, Inc.|Automated instrumentation for production of peptide libraries|
US10526598B2|2018-04-24|2020-01-07|Inscripta, Inc.|Methods for identifying T-cell receptor antigens|
US10858761B2|2018-04-24|2020-12-08|Inscripta, Inc.|Nucleic acid-guided editing of exogenous polynucleotides in heterologous cells|
US11142740B2|2018-08-14|2021-10-12|Inscripta, Inc.|Detection of nuclease edited sequences in automated modules and instruments|
US10532324B1|2018-08-14|2020-01-14|Inscripta, Inc.|Instruments, modules, and methods for improved detection of edited sequences in live cells|
AU2019326408A1|2018-08-23|2021-03-11|Sangamo Therapeutics, Inc.|Engineered target specific base editors|
CN113227368A|2018-10-22|2021-08-06|因思科瑞普特公司|Engineered enzymes|
US11214781B2|2018-10-22|2022-01-04|Inscripta, Inc.|Engineered enzyme|
EP3931313A2|2019-01-04|2022-01-05|Mammoth Biosciences, Inc.|Programmable nuclease improvements and compositions and methods for nucleic acid amplification and detection|
CN113631713A|2019-03-25|2021-11-09|因思科瑞普特公司|Simultaneous multiplex genome editing in yeast|
US11001831B2|2019-03-25|2021-05-11|Inscripta, Inc.|Simultaneous multiplex genome editing in yeast|
WO2020247587A1|2019-06-06|2020-12-10|Inscripta, Inc.|Curing for recursive nucleic acid-guided cell editing|
WO2020257395A1|2019-06-21|2020-12-24|Inscripta, Inc.|Genome-wide rationally-designed mutations leading to enhanced lysine production in e. coli|
US10927385B2|2019-06-25|2021-02-23|Inscripta, Inc.|Increased nucleic-acid guided cell editing in yeast|
WO2021102059A1|2019-11-19|2021-05-27|Inscripta, Inc.|Methods for increasing observed editing in bacteria|
WO2021118626A1|2019-12-10|2021-06-17|Inscripta, Inc.|Novel mad nucleases|
US10704033B1|2019-12-13|2020-07-07|Inscripta, Inc.|Nucleic acid-guided nucleases|
US11008557B1|2019-12-18|2021-05-18|Inscripta, Inc.|Cascade/dCas3 complementation assays for in vivo detection of nucleic acid-guided nuclease edited cells|
WO2021133977A1|2019-12-23|2021-07-01|The Broad Institute, Inc.|Programmable dna nuclease-associated ligase and methods of use thereof|
US10689669B1|2020-01-11|2020-06-23|Inscripta, Inc.|Automated multi-module cell processing methods, instruments, and systems|
US20210332388A1|2020-04-24|2021-10-28|Inscripta, Inc.|Compositions, methods, modules and instruments for automated nucleic acid-guided nuclease editing in mammalian cells|
Legal status:
2020-07-21| B06A| Patent application procedure suspended [chapter 6.1 patent gazette]|
2020-10-27| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2021-01-19| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 20/02/2017, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Filing date | Patent title
US201662344858P| true| 2016-06-02|2016-06-02|
US62/344,858|2016-06-02|
US201662358415P| true| 2016-07-05|2016-07-05|
US62/358,415|2016-07-05|
PCT/US2017/018589|WO2017209809A1|2016-06-02|2017-02-20|Using programmable dna binding proteins to enhance targeted genome modification|